
Design of Flash-Based DBMS: An In-Page Logging Approach
Tsung
Computer Science, RUC
Outline
Characteristics of Flash Memory
Main Idea of In-Page Logging
Design Manifesto && Data Structure of IPL
Read, Update && Merge Algorithms of IPL
Without Transaction
Support for Transaction
Experiments of IPL
Conclusion
Characteristics of Flash Memory (NAND)
• Structure of NAND Flash Memory
Page (512k) ↓
Erase Unit (16 pages) ↓
Flash Chip (*G units) ↓
Flash Memory (**G is available)
• Key hardware limits of Flash Memory
Electronic device (uniform access speed, no mechanical parts)
Different granularity (read/write by page, erase by unit)
Erase before write (a written page cannot be overwritten)
Sequential write within an erase unit (page 0, 1, 2, ... page 15)
Finite number of erase cycles (typically up to 100,000)
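The hardware limits listed above can be made concrete with a minimal Python sketch. The class name `EraseUnit`, its method names, and the error types are hypothetical and chosen only for illustration; the 16-page unit size and the 100,000-cycle limit come from the slide. It is a toy model, not a description of any particular chip.

```python
class EraseUnit:
    """Toy model of one NAND erase unit (hypothetical, for illustration)."""

    def __init__(self, pages_per_unit=16, max_erase_cycles=100_000):
        self.pages = [None] * pages_per_unit   # None means erased and writable
        self.next_page = 0                     # pages must be written in order
        self.erase_count = 0
        self.max_erase_cycles = max_erase_cycles

    def read(self, page_no):
        # Reads work at page granularity; any page costs the same to reach.
        return self.pages[page_no]

    def write(self, page_no, data):
        # Erase before write: a page can be written only once per erase.
        if self.pages[page_no] is not None:
            raise ValueError("page already written; erase the unit first")
        # Sequential write: page 0, then 1, and so on within the unit.
        if page_no != self.next_page:
            raise ValueError("pages must be written sequentially")
        self.pages[page_no] = data
        self.next_page += 1

    def erase(self):
        # Erase works on the whole unit and consumes one erase cycle.
        if self.erase_count >= self.max_erase_cycles:
            raise RuntimeError("erase unit worn out")
        self.pages = [None] * len(self.pages)
        self.next_page = 0
        self.erase_count += 1
```

In this model, updating a single page in place means erasing the whole unit and rewriting every page in it, which is the cost that in-page logging tries to avoid.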
• Impact on software design
No in-place update: how to avoid and buffer updates? How to maintain indexes without high cost?
No mechanical latency: clustered storage no longer matters; data can be scattered around the memory without substantial penalty.
Asymmetric speed of read/write/erase: traditional I/O cost measures no longer make sense; moreover, writes (and therefore erases) can be avoided at the expense of extra reads.
Main Idea of In-Page Logging
Use a log to buffer writes, so that writes can be combined and erases reduced.
Take advantage of the uniform cost of sequential and random access to co-locate a page and its log in the same erase unit.
Design Manifesto && Data Structure of IPL
• Design Manifesto (memory hierarchy = RAM + NAND)
Take advantage of flash memory's access characteristics (uniform sequential/random access and faster reads).
Overcome the erase-before-write limitation by avoiding writes and erases.
Minimize the changes made to the DBMS architecture: changes are limited to the buffer manager and the storage manager only.
• Data Structure of IPL
1. Erase unit = 15 data pages (page-basis) + 1 log area
2. Structure of Buffer and Flash Memory (figure)
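The layout described above can be sketched as a few Python data classes. All names (`LogRecord`, `IplEraseUnit`, `BufferedPage`) are hypothetical, the number of log sectors per unit is an assumption since the slide does not state it, and the 8-byte page images are toy placeholders.

```python
from dataclasses import dataclass, field

PAGES_PER_UNIT = 15         # from the slide: 15 data pages per erase unit
LOG_SECTORS_PER_UNIT = 16   # assumption: the slide does not give this number


@dataclass
class LogRecord:
    page_no: int   # which of the 15 pages the change belongs to
    offset: int    # byte offset within the page
    data: bytes    # new bytes to place at that offset


@dataclass
class IplEraseUnit:
    # Page-basis: the last fully written image of each data page (toy size).
    pages: list = field(default_factory=lambda: [bytes(8)] * PAGES_PER_UNIT)
    # Log area: flushed log sectors, each a list of LogRecord.
    log_sectors: list = field(default_factory=list)

    def has_free_log_sector(self):
        return len(self.log_sectors) < LOG_SECTORS_PER_UNIT


@dataclass
class BufferedPage:
    unit_no: int
    page_no: int
    image: bytearray                                 # in-memory copy of the page
    log_sector: list = field(default_factory=list)   # in-memory log sector
```

Each buffered page carries its own small in-memory log sector, so that only the log sector, not the whole page, needs to travel to flash when the page is evicted.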
Read, Update && Merge Algorithms of IPL
Without Transaction
• Update logic
1. Update the page-basis and append a log record to its log sector directly (both in RAM).
2. When the log sector is full or the dirty page is evicted from the buffer pool, write the log sector to flash. (The page-basis is not written to flash.)
3. When no free log sector is available, trigger a merge operation.
• Read logic
1. When the page is in the buffer pool, return it.
2. When there is a page fault, read the page and its log from flash, apply the logged changes to the page-basis, and return the newly computed page.
• Merge logic
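The update and read steps above can be sketched as a toy Python store covering a single erase unit. The class `IplStore`, its parameters, and the representation of pages as dictionaries are hypothetical simplifications. The merge step is not spelled out in this transcript, so `_merge` assumes that a merge folds all logged changes into the page-basis, writes the result to a fresh erase unit, and erases the old one.

```python
class IplStore:
    """Toy in-page logging store for one erase unit (a sketch only)."""

    def __init__(self, pages=15, log_sectors=4, records_per_sector=2):
        self.flash_pages = [dict() for _ in range(pages)]  # page-basis on flash
        self.flash_log = []          # log sectors already written to flash
        self.log_sectors = log_sectors
        self.records_per_sector = records_per_sector
        self.buffer = {}             # page_no -> (in-memory page, log sector)
        self.merges = 0

    def read(self, page_no):
        if page_no in self.buffer:               # 1. buffer hit
            return self.buffer[page_no][0]
        page = dict(self.flash_pages[page_no])   # 2. page fault: read page-basis
        for sector in self.flash_log:            #    and replay its log records
            for p, key, value in sector:
                if p == page_no:
                    page[key] = value
        self.buffer[page_no] = (page, [])
        return page

    def update(self, page_no, key, value):
        page = self.read(page_no)                # makes sure the page is buffered
        sector = self.buffer[page_no][1]
        page[key] = value                        # 1. update the in-memory page
        sector.append((page_no, key, value))     #    and its in-memory log sector
        if len(sector) == self.records_per_sector:
            self._flush(page_no)                 # 2. log sector is full

    def evict(self, page_no):
        self._flush(page_no)                     # 2. dirty page leaves the buffer
        del self.buffer[page_no]

    def _flush(self, page_no):
        sector = self.buffer[page_no][1]
        if not sector:
            return
        if len(self.flash_log) == self.log_sectors:
            self._merge()                        # 3. no free log sector
        self.flash_log.append(list(sector))      # only the log reaches flash
        sector.clear()

    def _merge(self):
        # Assumed merge: apply every log record to the page-basis, as if the
        # pages were rewritten to a fresh erase unit and the old one erased.
        for sector in self.flash_log:
            for p, key, value in sector:
                self.flash_pages[p][key] = value
        self.flash_log = []
        self.merges += 1
```

Note that `flash_pages` changes only inside `_merge`; every other path writes log sectors only, which is how the scheme trades extra read work for fewer page writes and erases.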
Support for transaction
• Some concepts
1. Two buffer management policies: No-force (requires REDO) / Force
2. Three transaction states: Committed / Aborted / Active
3. IPL adopts No-force and avoids REDO with a redefined read logic
• Additional log and data structure
1. Global transaction log: records the status of each transaction
2. Dirty page list: finds a transaction's pages quickly
• Transaction handling idea
T committed: T's log is applied (simple to handle)
T aborted: T's log is ignored (simple to handle)
T active: T's log isn't applied yet (raises new problems)
New problems: the erase unit's log sectors overflow + low merge efficiency
Solutions: an append unit for the log + selective merge
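The two additional structures named above can be sketched in Python as follows. The names `TxStatus` and `TransactionTable` and the in-memory dictionary representation are hypothetical; the transcript only states what the structures are for, not how they are stored.

```python
from collections import defaultdict
from enum import Enum


class TxStatus(Enum):
    ACTIVE = "active"
    COMMITTED = "committed"
    ABORTED = "aborted"


class TransactionTable:
    """Global transaction log plus dirty page list (illustrative only)."""

    def __init__(self):
        self.status = {}                      # txn id -> TxStatus
        self.dirty_pages = defaultdict(set)   # txn id -> pages it has logged to

    def begin(self, txn):
        self.status[txn] = TxStatus.ACTIVE

    def note_update(self, txn, page_no):
        self.dirty_pages[txn].add(page_no)

    def finish(self, txn, committed):
        self.status[txn] = TxStatus.COMMITTED if committed else TxStatus.ABORTED
        # The dirty page list gives the transaction's pages without a scan.
        return sorted(self.dirty_pages.pop(txn, set()))
```

A caller could use the list returned by `finish` to decide which in-memory log sectors to write to flash when a transaction commits.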
Support for transaction from an operational perspective
• Update
1. Append a log record to the page's log sector directly (in RAM); only the log sector changes.
2. When the log sector is full, the dirty page is evicted from the buffer pool, or the transaction commits, write the log sector to flash. (The page-basis is not written to flash.)
3. When no free log sector is available, trigger a merge operation.
• Read
Read the page-basis and its log sector from flash and apply the log to the page-basis using the following policy:
Committed: applied
Aborted / Active: ignored
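The read policy above can be sketched as a single Python function. The function name `read_page`, the tuple layout of log records, and the string status values are hypothetical choices made for this example.

```python
def read_page(page_basis, log_records, tx_status):
    """Rebuild a page, applying only log records of committed transactions.

    page_basis  -- dict slot -> value, the page image stored on flash
    log_records -- list of (txn_id, slot, value) in the order they were logged
    tx_status   -- dict txn_id -> "committed" | "aborted" | "active"
    """
    page = dict(page_basis)              # never modify the page-basis itself
    for txn_id, slot, value in log_records:
        if tx_status.get(txn_id) == "committed":
            page[slot] = value           # committed: applied
        # aborted or active: ignored
    return page
```

Because aborted and active records are simply skipped when the page is rebuilt, the page-basis on flash never has to be rolled back.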
 Selective Merge
Experiments of IPL
Flash/Disk access characteristics in sequential/random patterns
Experimental results show:
Read: disk is more sensitive to the access pattern than flash (flash is an electronic device).
Write: flash is more sensitive to the access pattern than disk (write/erase are time-consuming).
TPC-C Locality Analysis
Setup of TPC-C
(d) The temporal locality of data page updates: the probability that 16 consecutive physical writes would be done for 16 distinct data pages is 99.99%, and the probability that a write causes a unit erase is 93.1%.
Experimental results show:
The distribution of update frequencies was highly skewed.
The temporal locality of page updates is bad.
Using the buffer pool alone isn't enough.
Using a log to buffer writes may be efficient.
Impact of the size of the log region per erase unit (IPL)
Experimental results show:
A larger log region per erase unit reduces the write time.
Impact of the buffer pool size of the database server (IPL)
t_Conv = (α × number of page writes) × 20 ms
α denotes the probability that a write will cause a unit erase.
α is 93.1% according to the previous analysis of TPC-C temporal locality.
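As a worked example of the estimate above, here is a small Python helper. The function name `t_conv_ms` and the constant name are hypothetical; the sketch assumes that t_Conv stands for the estimated write time of a conventional (non-IPL) server and that the 20 ms constant is the cost charged for each write that triggers a unit erase.

```python
COST_PER_ERASE_MS = 20   # the 20 ms constant from the slide


def t_conv_ms(page_writes, alpha=0.931):
    """Estimated write time in milliseconds: (alpha * page writes) * 20 ms."""
    return alpha * page_writes * COST_PER_ERASE_MS
```

For 1,000 page writes with α = 93.1%, the formula gives 0.931 × 1,000 × 20 ms, which is about 18.6 seconds.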
Experimental results show:
A larger buffer pool reduces the write and merge times.
Conclusion
• This paper is simple but timely
Main contribution: in-page logging + support for transactions
• What is Flash and what is DBMS
• Q&A
Thanks