Discussion Class 10
The Google File System
Discussion Classes
Format:
Question
Ask a member of the class to answer.
Provide opportunity for others to comment.
When answering:
Stand up.
Give your name. Make sure that the TA hears it.
Speak clearly so that all the class can hear.
Suggestions:
Do not be shy about presenting partial answers.
Differing viewpoints are welcome.
Question 1: File System
(a) What is the purpose of a file system, such as the Google
File System?
(b) How does a file system interact with the other parts of a
computer system?
(c) Why does Google not use the standard Linux file
system?
(d) Why does Google not use the POSIX API?
(e) What does Google lose by having its own file system?
Question 2: Interface
In addition to the standard operations of create,
delete, open, close, read, and write, the Google File
System has two special operations:
snapshot
record append
(a) What do these two operations do?
(b) Give examples of applications that these
operations are designed to support.
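As a starting point for discussion, the two special operations can be pictured with a toy in-memory model. This is only a sketch under strong assumptions: the names ToyFile, record_append, and snapshot are hypothetical and are not the Google File System client API. The toy also leaves out what makes the real operations interesting, such as replication, chunk boundaries, padding, possible duplicate records after retries, and copy-on-write sharing of chunks.

```python
class ToyFile:
    """Hypothetical in-memory stand-in for a file; not the real GFS API."""

    def __init__(self) -> None:
        self.data = bytearray()

    def record_append(self, record: bytes) -> int:
        # The file system, not the caller, chooses the offset and returns it.
        offset = len(self.data)
        self.data += record
        return offset

    def snapshot(self) -> "ToyFile":
        # Assumption: an eager copy. A real system would avoid copying data
        # until one of the two versions is modified.
        copy = ToyFile()
        copy.data = bytearray(self.data)
        return copy


log = ToyFile()
first = log.record_append(b"producer-A;")
second = log.record_append(b"producer-B;")
frozen = log.snapshot()
log.record_append(b"producer-C;")

print(first, second)       # offsets chosen by the file, not by the writers
print(bytes(frozen.data))  # the snapshot is unaffected by the later append
```

The point to notice is that writers never pass an offset to record_append, which is what lets many producers share one file without coordinating with each other.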
Question 3: Performance
Discuss these three performance graphs.
Question 4: Component Failure
In the Google File System, data is stored on chunk servers.
Suppose that a chunk server has a disk crash.
(a) What precautions are taken to ensure that this does not
result in a total loss of data?
(b) What is the recovery process after a chunk server fails?
(c) What is the role of the master server in this recovery?
(d) What if the master server also fails?
Question 5: Recovery Experiment
The paper describes two recovery experiments in which each killed
chunk server held roughly 15,000-16,000 chunks with 600-660 GB of data.
(a) One chunk server was killed. Recovery time was 23
minutes. Describe the process.
(b) Two chunk servers were killed. Explain how the recovery
process was changed.
(c) Suppose that the master server had been killed. How would
the recovery have taken place?
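A back-of-the-envelope calculation may help frame part (a). The sketch below uses the lower figures quoted above (600 GB, 15,000 chunks, 23 minutes) and assumes decimal units (1 GB = 1000 MB). The function names are illustrative only.

```python
def effective_rate_mb_per_s(data_gb: float, minutes: float) -> float:
    """Aggregate re-replication rate, assuming 1 GB = 1000 MB."""
    return data_gb * 1000 / (minutes * 60)


def average_chunk_mb(data_gb: float, chunks: int) -> float:
    """Average amount of data per chunk, assuming 1 GB = 1000 MB."""
    return data_gb * 1000 / chunks


rate = effective_rate_mb_per_s(600, 23)
avg = average_chunk_mb(600, 15_000)

print(f"{rate:.0f} MB/s")     # prints "435 MB/s"
print(f"{avg:.0f} MB/chunk")  # prints "40 MB/chunk"
```

An aggregate rate of this size is far more than a single disk or network link of that era could sustain, which suggests that the copying was spread across many chunk servers in parallel.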
Question 6: Master Servers
The Google File System uses clusters each consisting of
one master server and several chunk servers.
(a) What role does each play?
(b) What are the advantages of this architecture?
(c) What precautions are taken to prevent the master server
from becoming a bottleneck?
Question 7: Consistency Model
(a) What is a consistency model?
(b) How does the consistency model used by the Google File
System differ from the model that might be used by a
general purpose file system?
(c) How does the file system guarantee correctness of the
namespace?
(d) Explain what happens, from a consistency viewpoint,
when an application attempts to append to a file.