COSC 6360 Review Session December 2004

About the second program
It is quite easy to get the timestamps of a file:
System call stat(…) returns a data structure
containing all of them
Data structure defined in
<sys/stat.h>
Example
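#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
int main(int argc, char *argv[]) {
    int mymode;
    struct stat statbuf;
    if (argc != 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 1;
    }
    if (stat(argv[1], &statbuf) != 0) {
        perror(argv[1]);    // stat() failed, e.g. no such file
        return 1;
    }
    mymode = statbuf.st_mode % 01000;   // keep the nine permission bits
    printf("File mode is %o\n", mymode);
    return 0;
} // main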
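The example above prints the mode; the timestamps come out of the same structure. A minimal sketch, assuming a hypothetical helper get_file_times() and a hypothetical struct file_times (neither is part of the assignment); only stat() and the st_atime, st_mtime and st_ctime fields are standard:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical container for the three timestamps kept in the i-node. */
struct file_times {
    time_t accessed;   /* st_atime: last access */
    time_t modified;   /* st_mtime: last modification of the contents */
    time_t changed;    /* st_ctime: last change of the i-node itself */
};

/* Hypothetical helper: fills *t with the timestamps of path.
   Returns 0 on success, -1 if stat() fails. */
int get_file_times(const char *path, struct file_times *t)
{
    struct stat statbuf;

    assert(path != NULL && t != NULL);
    if (stat(path, &statbuf) != 0)
        return -1;
    t->accessed = statbuf.st_atime;
    t->modified = statbuf.st_mtime;
    t->changed  = statbuf.st_ctime;
    return 0;
}
```

Each field is a time_t, so ctime(&t.modified) turns it into a printable string.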
File Systems
LFS, Journaling file systems and Soft
Updates all address the issue of
metadata updates
What is the metadata update
problem?
Metadata updates (I)
UNIX file system depends on its delayed write
policy to achieve good performance
Writes are recorded in a shared I/O buffer
Recorded on disk when
File blocks are expelled from I/O buffer or
After a few seconds
User and FS have no control over timing or order
of these writes
Metadata updates (II)
Delayed writes are not acceptable for metadata
updates
System crash would leave file system in an
inconsistent state
First Example
[Figure: I/O buffer holding a directory block and the i-node block it points to]
Directory block contains new entry pointing to a new i-node
Cannot write to disk directory block before i-node block it points to
Otherwise
Crash wipes entire contents of I/O buffer
[Figure: on disk, the new directory block points to an i-node block that was never written ("?")]
Second Example
[Figure: I/O buffer holding a directory block with a deleted entry (X) and the i-node block it pointed to]
We removed from directory block an entry pointing to an i-node that is being recycled
Must write to disk directory block before i-node block it points to
Otherwise
Crash wipes entire contents of I/O buffer
[Figure: on disk, the old directory block still points to an i-node that was already recycled ("?")]
Solutions (I)
Use NVRAM cache
Costly
Require all metadata updates to be written
synchronously in the right order
Solution of UNIX FFS
Makes inefficient use of disk bandwidth
Do all the writes on a log (LFS)
Makes much better use of disk bandwidth
Solutions (II)
Record all metadata updates on a sequential log
before writing them at their regular location
Journaling file systems
Requires additional writes to the log/journal
These writes can be buffered or non-buffered
Ensure that I/O buffer blocks are written back to
disk in the right order
Soft updates
Not always possible
The trouble with Soft Updates
[Figure: I/O buffer holding a directory block with a deleted entry (X) and an i-node block]
Same directory block contains a new entry and an old entry being deleted
Both entries point to the same i-node block
We have a circular dependency
Soft Update Solution
Do it in three steps
1. Write directory block with old entry deleted but
without new entry
2. Write i-node block with new i-node and old i-node already recycled (no directory entry
points to it)
3. Write directory block with old entry deleted and
new entry pointing to new i-node
Disk always remains in a consistent state
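The three steps can be checked on a toy model. This is a sketch under stated assumptions, not the Soft Updates implementation: struct fs_state, is_consistent() and three_step_update() are hypothetical names, each block has only two slots, and "consistent" means only that no on-disk directory entry points to a free i-node.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SLOTS    2      /* entries per directory block, i-nodes per i-node block */
#define NO_ENTRY (-1)

/* One directory block and one i-node block (toy model). */
struct fs_state {
    int  dir[SLOTS];        /* dir[k]: i-node number entry k points to, or NO_ENTRY */
    bool inode_used[SLOTS]; /* inode_used[n]: i-node n is allocated */
};

/* Invariant of the model: no directory entry points to a free i-node. */
bool is_consistent(const struct fs_state *d)
{
    for (int k = 0; k < SLOTS; k++) {
        int n = d->dir[k];
        if (n == NO_ENTRY)
            continue;
        assert(n >= 0 && n < SLOTS);
        if (!d->inode_used[n])
            return false;
    }
    return true;
}

/* Brings disk to the state held in buffer using the three writes.
   Returns true if the disk was consistent after every single write. */
bool three_step_update(struct fs_state *disk, const struct fs_state *buffer)
{
    int deletions_only[SLOTS];
    bool ok = true;

    /* Directory block with old entries deleted but new entries withheld. */
    for (int k = 0; k < SLOTS; k++)
        deletions_only[k] =
            (buffer->dir[k] == disk->dir[k]) ? disk->dir[k] : NO_ENTRY;

    /* Step 1: write directory block without the new entry. */
    memcpy(disk->dir, deletions_only, sizeof disk->dir);
    ok = ok && is_consistent(disk);

    /* Step 2: write i-node block (new i-node allocated, old one recycled). */
    memcpy(disk->inode_used, buffer->inode_used, sizeof disk->inode_used);
    ok = ok && is_consistent(disk);

    /* Step 3: write the full directory block, new entry included. */
    memcpy(disk->dir, buffer->dir, sizeof disk->dir);
    ok = ok && is_consistent(disk);

    return ok;
}
```

In this model, writing either buffered block as is, before the other, breaks the invariant, which is the circular dependency of the previous slide.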
Which data structures are
used by LFS?
Key considerations
We can assume that most reads will be completed
without disk access
I-node tables now fragmented into multiple blocks
On-Disk Data Structures
Log
Contains data blocks, i-node blocks, blocks of i-node map, segment summaries and directory
change log
Checkpoint area
Contains
Address of end of log at checkpoint time
Addresses of all i-node map blocks at
checkpoint time
In-Memory Data Structures
I/O Buffer
Also contains recently accessed i-node map
blocks
Finding the data (I)
This means
Finding the i-node
Locating the data blocks
If data blocks are in I/O buffer, we are done
Otherwise check whether i-node is cached
Can reasonably hope that at least the required
i-node block is cached
Finding the data (II)
When nothing can be found in main memory
Go to checkpoint area
Find there address of i-node map blocks
Locate i-node of file
Locate data blocks
After a crash we may have to look up the
portion of the log after the last checkpoint to
locate new i-node map blocks, new i-node
blocks and new data blocks
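The lookup chain from the checkpoint area down to a data block can be sketched as follows. Everything here is a simplifying assumption rather than the real LFS layout: the names (struct lfs, lfs_locate(), lfs_init()), the tiny sizes, one i-node per i-node block, direct pointers only, and no roll-forward through the log after a crash.

```c
#include <assert.h>
#include <stddef.h>

#define LOG_BLOCKS      32
#define PTRS_PER_BLOCK   4
#define IMAP_BLOCKS      2
#define NO_ADDR        (-1)

enum block_kind { UNUSED, DATA, INODE, IMAP };

/* A block of the log. ptr[] holds data block addresses for an INODE
   block and i-node addresses for an IMAP block (toy layout). */
struct log_block {
    enum block_kind kind;
    int ptr[PTRS_PER_BLOCK];
};

struct checkpoint {
    int log_end;                  /* address of end of log at checkpoint time */
    int imap_addr[IMAP_BLOCKS];   /* addresses of all i-node map blocks */
};

struct lfs {
    struct log_block log[LOG_BLOCKS];
    struct checkpoint cp;
};

/* Marks every block unused and every address unknown. */
void lfs_init(struct lfs *fs)
{
    assert(fs != NULL);
    for (int b = 0; b < LOG_BLOCKS; b++) {
        fs->log[b].kind = UNUSED;
        for (int p = 0; p < PTRS_PER_BLOCK; p++)
            fs->log[b].ptr[p] = NO_ADDR;
    }
    fs->cp.log_end = 0;
    for (int m = 0; m < IMAP_BLOCKS; m++)
        fs->cp.imap_addr[m] = NO_ADDR;
}

/* Returns the log address of block blkno of file ino, or NO_ADDR. */
int lfs_locate(const struct lfs *fs, int ino, int blkno)
{
    assert(fs != NULL);
    if (ino < 0 || ino >= IMAP_BLOCKS * PTRS_PER_BLOCK)
        return NO_ADDR;
    if (blkno < 0 || blkno >= PTRS_PER_BLOCK)
        return NO_ADDR;

    /* 1. Checkpoint area gives the address of the i-node map block. */
    int imap_at = fs->cp.imap_addr[ino / PTRS_PER_BLOCK];
    if (imap_at == NO_ADDR)
        return NO_ADDR;
    assert(fs->log[imap_at].kind == IMAP);

    /* 2. I-node map block gives the address of the i-node. */
    int inode_at = fs->log[imap_at].ptr[ino % PTRS_PER_BLOCK];
    if (inode_at == NO_ADDR)
        return NO_ADDR;
    assert(fs->log[inode_at].kind == INODE);

    /* 3. I-node gives the address of the data block. */
    return fs->log[inode_at].ptr[blkno];
}
```

A cached i-node map block or i-node would let the lookup skip steps 1 and 2, which is why most reads are expected to need no extra disk access.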
What should I know about
segment cleaning?
Segment Cleaning
Key idea is to group into the same segments files of
similar age in order to have
Stable segments that will be rarely cleaned
Segments whose contents change very quickly
Their data age very quickly
These segments will return a lot of free space
whenever they are cleaned
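One way to get this grouping is to sort the live blocks by age before the cleaner writes them back. A minimal sketch, assuming hypothetical names (struct live_block, group_by_age(), output_segment()) and a last-modification time recorded per block; it shows the grouping only, not how segments are chosen for cleaning:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A live block salvaged by the cleaner (hypothetical record). */
struct live_block {
    int  block_id;
    long last_modified;   /* smaller value = older data */
};

/* qsort() comparison: oldest block first. */
static int by_age(const void *a, const void *b)
{
    const struct live_block *x = a;
    const struct live_block *y = b;
    return (x->last_modified > y->last_modified)
         - (x->last_modified < y->last_modified);
}

/* Orders live blocks so that blocks of similar age are adjacent. */
void group_by_age(struct live_block *blocks, size_t n)
{
    assert(blocks != NULL);
    qsort(blocks, n, sizeof blocks[0], by_age);
}

/* After sorting, block i is written to this output segment. */
size_t output_segment(size_t i, size_t blocks_per_segment)
{
    assert(blocks_per_segment > 0);
    return i / blocks_per_segment;
}
```

Old blocks then fill the first output segments, which stay stable, while recently modified blocks end up together in the last ones.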
What about Elephant?
Elephant
Key idea is defining which versions to preserve
Two objectives
Being able to undo recent mistakes
Being able to retrieve old versions of a file
Solution
Keep the complete history of a file over a short
period of time (one hour to one week)
Keep forever landmark versions of each file
Example
[Figure: timeline of the complete history of a file, each X marking a version. Elephant keeps two landmark versions and all recent versions]
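The retention rule can be sketched as a filter over the version list. The names (struct version, keep_version(), prune_versions()) are hypothetical, and the landmark flag is taken as given: the slides do not say how Elephant picks landmark versions, so the sketch does not try to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* One stored version of a file (hypothetical record). */
struct version {
    time_t created;
    bool   landmark;   /* assumed to be decided elsewhere */
};

/* A version survives if it is a landmark or still inside the window
   over which the complete history is kept. */
bool keep_version(const struct version *v, time_t now, double window_seconds)
{
    assert(v != NULL);
    return v->landmark || difftime(now, v->created) <= window_seconds;
}

/* Compacts v[0..n) in place, keeping order. Returns the new count. */
size_t prune_versions(struct version *v, size_t n, time_t now,
                      double window_seconds)
{
    size_t kept = 0;

    assert(v != NULL);
    for (size_t i = 0; i < n; i++)
        if (keep_version(&v[i], now, window_seconds))
            v[kept++] = v[i];
    return kept;
}
```

With a window between one hour and one week, window_seconds would lie between 3600 and 604800.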
Distributed File Systems
Let us look first at NFS and Coda
What makes a server
stateless?
Stateless server
Keeps no track of previous user requests
Advantages
Robustness
Server can reboot after any crash
Simplicity of design
Disadvantages
Inefficient consistency control
Neither server nor client knows whether other
workstations access a given file
Must always assume risk of shared access even
though shared access is infrequent
Requires a write-through policy at server
Reintroducing state
Some state information does not need to be saved
in stable storage:
Temporary data that would expire before the
server can be rebooted
Leases, callbacks
Data that the client keeps in stable storage
Safe asynchronous writes allow NFS server
to delay writes until client commits them
What is close to open
consistency?
Close to open consistency
Guarantees that every process opening a file will
see all the changes brought to the file by the last
process that closed the file
Processes must
Check at open time whether they have the most
recent version of the file known to server
Propagate all their changes to the server when
they close the file
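The two obligations can be sketched with one file, a server-side version number and a per-client cache. All names (struct server_file, struct client_cache, client_open(), client_write(), client_close()) are hypothetical, the file contents are reduced to a single int, and the sketch is not the NFS protocol itself:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The file as known to the server (toy model). */
struct server_file {
    int version;
    int contents;
};

/* What one client holds in its cache. */
struct client_cache {
    bool valid;
    int  version;
    int  contents;
    bool dirty;
};

/* Open: check whether the cached copy is the most recent version
   known to the server, and fetch it again if it is not. */
void client_open(struct client_cache *c, const struct server_file *s)
{
    assert(c != NULL && s != NULL);
    if (!c->valid || c->version != s->version) {
        c->contents = s->contents;
        c->version  = s->version;
        c->valid    = true;
        c->dirty    = false;
    }
}

/* Write: only the cached copy changes. */
void client_write(struct client_cache *c, int value)
{
    assert(c != NULL && c->valid);
    c->contents = value;
    c->dirty    = true;
}

/* Close: propagate all changes to the server. */
void client_close(struct client_cache *c, struct server_file *s)
{
    assert(c != NULL && s != NULL);
    if (c->dirty) {
        s->contents = c->contents;
        s->version++;
        c->version = s->version;
        c->dirty   = false;
    }
}
```

A client that keeps the file open across another client's close still sees its old copy; the guarantee only applies at the next open.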
What are callbacks?
Callbacks
Are promises made by the server to notify a client
when it receives a new version of the file from any
other client
A client having a callback on a file does not need
to check with the server whether it has the most
recent version of the file known to server
Notifications can be lost
Clients must periodically check the validity of
their callbacks
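A toy model of the callback promise, with hypothetical names (struct cb_server, struct cb_client, cb_open(), cb_store()) and one file reduced to an int. It assumes notifications are never lost, so the periodic validity check is left out:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CLIENTS 3

struct cb_client {
    int  contents;       /* cached copy of the file */
    bool has_callback;   /* server promised to notify us of new versions */
};

struct cb_server {
    int  contents;
    bool callback[CLIENTS];   /* promises the server has made */
    int  fetches;             /* how many opens reached the server */
};

/* Open by client id: with a callback the cached copy is used as is. */
void cb_open(struct cb_server *s, struct cb_client clients[], int id)
{
    assert(s != NULL && clients != NULL && id >= 0 && id < CLIENTS);
    if (clients[id].has_callback)
        return;                       /* no need to check with the server */
    clients[id].contents     = s->contents;
    clients[id].has_callback = true;
    s->callback[id]          = true;
    s->fetches++;
}

/* Store by client writer: the server breaks every other callback. */
void cb_store(struct cb_server *s, struct cb_client clients[],
              int writer, int value)
{
    assert(s != NULL && clients != NULL && writer >= 0 && writer < CLIENTS);
    s->contents = value;
    clients[writer].contents = value;
    for (int i = 0; i < CLIENTS; i++) {
        if (i != writer && s->callback[i]) {
            s->callback[i]          = false;
            clients[i].has_callback = false;   /* the notification */
        }
    }
}
```

Repeated opens of an unshared file cost the server one fetch in total, which is the workload reduction the callback is meant to buy.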
Why do we have callbacks?
Callbacks
Reduce the server workload whenever most files
are not shared
The server bets that it will never have to break
the callback
 “Do not call us, we will call you!”
What is the NFS model of
consistency?
NFS
Clients
Frequently check the validity of the data blocks
in their cache
Frequently send to the server the new values of
the blocks they have modified
Client and server also enforce close to open
consistency
How does Coda detect
inconsistencies?
Coda
Each replica has
ID of last store (LSID)
A current version vector (CVV) with
The version number of the replica
Conservative estimates of the version
numbers of the other replicas
Example
Three copies
A: LSID = 33345, v = 4, CVV = (4 4 3)
B: LSID = 33345, v = 4, CVV = (4 4 3)
C: LSID = 2235, v = 3, CVV = (3 3 3)
Coda (II)
Coda compares the states of replicas by
comparing their LSID’s and CVV’s
Four outcomes are possible
Strong equality: same LSID’s and same
CVV’s
Everything is fine
Coda (III)
Weak equality:
Same LSID’s and different CVV’s
Happens when one site was never notified
that the other was updated
Must fix CVV’s
Coda (IV)
Dominance /Submission: LSID’s are different and
every element of the CVV of a replica is greater
than or equal to the corresponding element of the
CVV of the other replica
Example: two replicas A and B
CVVA = (4 3)
CVVB = (3 3)
A dominates B
B is dominated by A
A has the most recent version of the file
Coda (V)
Inconsistency: LSID’s are different and some
elements of the CVV of a replica are greater than
the corresponding elements of the CVV of the other
replica but others are smaller
Example: two replicas A and B
CVVA = (4 2)
CVVB = (2 3)
A and B are inconsistent
Must fix inconsistency before allowing access
to the file
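The four outcomes map onto a single comparison function. A minimal sketch with hypothetical names (enum coda_relation, coda_compare()); one case is not covered by the slides, different LSIDs with identical CVVs, and reporting it as an inconsistency is an assumption of this sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum coda_relation {
    STRONG_EQUALITY,   /* same LSID, same CVV */
    WEAK_EQUALITY,     /* same LSID, different CVV: must fix CVVs */
    A_DOMINATES,       /* A has the most recent version */
    B_DOMINATES,       /* B has the most recent version */
    INCONSISTENT       /* must be fixed before allowing access */
};

/* Compares two replicas given their LSIDs and their CVVs of n elements. */
enum coda_relation coda_compare(long lsid_a, const int *cvv_a,
                                long lsid_b, const int *cvv_b, size_t n)
{
    bool a_ahead = false;   /* some element of A's CVV is greater */
    bool b_ahead = false;   /* some element of B's CVV is greater */

    assert(cvv_a != NULL && cvv_b != NULL);
    for (size_t i = 0; i < n; i++) {
        if (cvv_a[i] > cvv_b[i])
            a_ahead = true;
        if (cvv_b[i] > cvv_a[i])
            b_ahead = true;
    }

    if (lsid_a == lsid_b)
        return (a_ahead || b_ahead) ? WEAK_EQUALITY : STRONG_EQUALITY;
    if (a_ahead && !b_ahead)
        return A_DOMINATES;
    if (b_ahead && !a_ahead)
        return B_DOMINATES;
    /* Both ahead somewhere, or (assumption) equal CVVs with different LSIDs. */
    return INCONSISTENT;
}
```

Applied to the three copies of the earlier example, A and B are strongly equal and A dominates C.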
What is the key idea in
the LBFS paper?
LBFS
Their use of variable-size chunks defined by their
contents rather than fixed-size file blocks
Can identify same data even when they start at
different offsets
[Figure: a file before and after an insertion]
With fixed-size blocks, the two files do not have a single block in common
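Content-defined chunk boundaries can be sketched as follows. This is not the LBFS code: LBFS uses Rabin fingerprints, while this sketch swaps in an FNV-1a hash recomputed from scratch over each window for clarity, and WINDOW, MASK and chunk_boundaries() are hypothetical choices with no minimum or maximum chunk size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WINDOW 16       /* bytes examined to decide on a boundary */
#define MASK   0x3Fu    /* boundary when the low 6 bits of the hash are 0 */

/* FNV-1a hash of the WINDOW bytes starting at w (stand-in for a
   Rabin fingerprint). */
static uint32_t window_hash(const unsigned char *w)
{
    uint32_t h = 2166136261u;          /* FNV-1a offset basis */
    for (size_t i = 0; i < WINDOW; i++) {
        h ^= w[i];
        h *= 16777619u;                /* FNV-1a prime */
    }
    return h;
}

/* Stores in out[] the end offset (exclusive) of each chunk of data and
   returns how many were stored. A boundary depends only on the WINDOW
   bytes that precede it, never on its offset in the file. */
size_t chunk_boundaries(const unsigned char *data, size_t len,
                        size_t *out, size_t max_out)
{
    size_t n = 0;

    assert(data != NULL && out != NULL);
    for (size_t end = WINDOW; end <= len && n < max_out; end++) {
        if (end == len || (window_hash(data + end - WINDOW) & MASK) == 0)
            out[n++] = end;
    }
    /* Input shorter than one window: a single chunk. */
    if (len > 0 && len < WINDOW && n < max_out)
        out[n++] = len;
    return n;
}
```

Because the decision looks only at the preceding window, inserting bytes at the start of a file shifts the later boundaries by the length of the insertion instead of moving them, so the chunks after the insertion keep the same contents.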
What is the key idea in
the Farsite paper?
Farsite
Builds a trusted distributed file system out of
untrustworthy workstations
Two techniques
Replicate encrypted copies of individual data
blocks on several hosts
Use a distributed directory server that tolerates
Byzantine, i.e., malicious failures
What is the key idea in
the LOCKSS paper?
LOCKSS
The way they organize polls
Costly for the poll initiator
Only affects the contents of the AU of the initiator
of the poll
Their concept of inner circle and outer circle peers
Complicates the task of the malicious peers
What should I know about
MEMS?
MEMS
Data are on a sled moving over a large number of
probes
Improves bandwidth
Does not require the sled to move very far
Three technologies
Magnetic recording (CMU)
Melting polymer (IBM)
Phase recording (HP)