Lecture for Chapter 6.4 (Fall 09)

6.4 Data And File Replication
Presenter: Jing He
Instructor: Dr. Yanqing Zhang
Outline
• Basic Knowledge
• Most Recent Projects
• Future Works
• References
Why replicate
• Performance
• Reliability
• Resource sharing
• Network resource saving
Challenge
• Transparency
– Parallelism transparency
– Failure transparency
– Replication transparency
• Concurrency Control
• Failure Recovery
Goal
• One-copy serializability:
– The execution of transactions on replicated objects is
equivalent to the execution of the same transactions on
non-replicated objects [1] [R. Chow et al., 1997].
Architecture
• FSA: file service agent, the client interface
• RM: replica manager, provides the replication functions
• A client chooses one or more FSAs to access a data object.
• The FSA acts as a front end to the replica managers (RMs) to provide
replication transparency.
• The FSA contacts one or more RMs for the actual updating and reading of
data objects.
[Figure: architecture diagram with Client, FSA, and RM nodes]
Read operations
• Read-one-primary: the FSA reads only from a primary
RM to enforce consistency
• Read-one: the FSA may read from any RM to gain
concurrency
• Read-quorum: the FSA must read from a quorum of
RMs to decide the currency of the data
Write Operations
• Write-one-primary: write only to the primary RM;
the primary RM updates all other RMs
• Write-all: update all RMs
• Write-all-available: write to all functioning RMs.
A faulty RM needs to be synchronized before it is brought
back online.
Write Operations Cont.
• Write-quorum: update a predefined
quorum of RMs
• Write-gossip: update any RM; the update is lazily
propagated to the other RMs
Read-one-primary, write-one-primary
• Other RMs are backups of the primary RM
• No concurrency
• Easy to serialize
• Simple to implement
• Achieve one-copy serializability
• Primary RM is performance bottleneck
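The read-one-primary, write-one-primary scheme can be illustrated with a minimal in-memory sketch. The class names `ReplicaManager` and `PrimaryCopyFSA` are hypothetical and do not come from the lecture, and the sketch ignores failures and networking. It only shows why the primary sees every operation, which makes serialization easy and also makes the primary the bottleneck.

```python
class ReplicaManager:
    """Hypothetical in-memory replica manager (RM)."""

    def __init__(self, name):
        self.name = name
        self.store = {}


class PrimaryCopyFSA:
    """Hypothetical FSA that routes every read and write to the primary RM."""

    def __init__(self, primary, backups):
        self.primary = primary
        self.backups = backups

    def write(self, key, value):
        # Write-one-primary: the primary applies the update first...
        self.primary.store[key] = value
        # ...and then pushes it to every backup RM.
        for rm in self.backups:
            rm.store[key] = value

    def read(self, key):
        # Read-one-primary: only the primary serves reads.
        return self.primary.store[key]


primary = ReplicaManager("rm0")
backups = [ReplicaManager("rm1"), ReplicaManager("rm2")]
fsa = PrimaryCopyFSA(primary, backups)
fsa.write("x", 1)
fsa.write("x", 2)
result = fsa.read("x")
```

Because all operations pass through one RM in one order, the backups end up with the same state as the primary.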
Read-one, Write-all
• Provides concurrency
• A concurrency control protocol is needed to ensure consistency
(serialization)
• Achieves one-copy serializability
• Difficult to implement (a single failed RM blocks
all updates)
Read-one, Write-all-available
• Variation of read-one, write-all
• May not guarantee one-copy serializability
• Can lead to many conflicts between transactions
Read-quorum, Write-quorum
• A version number is attached to each replicated object
• On a read, the copy with the highest version number is the
latest one.
• A write operation advances the version by 1
• To detect write-write conflicts: 2 × write quorum > number of object copies
• To detect read-write conflicts: write quorum + read quorum > number of
object copies
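The quorum rules above can be sketched as follows. This is a simplified, single-process illustration and not a full protocol: `QuorumStore` and `Replica` are hypothetical names, quorums are picked at random, and locking and failures are left out. With N = 5 copies and R = W = 3, both inequalities hold, so every read quorum overlaps the most recent write quorum.

```python
import random


class Replica:
    """Hypothetical replica holding one versioned object."""

    def __init__(self):
        self.version = 0
        self.value = None


class QuorumStore:
    """Hypothetical read-quorum / write-quorum store over n replicas."""

    def __init__(self, n, read_quorum, write_quorum):
        if 2 * write_quorum <= n:
            raise ValueError("need 2 * W > N to detect write-write conflicts")
        if read_quorum + write_quorum <= n:
            raise ValueError("need R + W > N to detect read-write conflicts")
        self.replicas = [Replica() for _ in range(n)]
        self.r = read_quorum
        self.w = write_quorum

    def read(self):
        # Any R replicas must include at least one up-to-date copy.
        quorum = random.sample(self.replicas, self.r)
        latest = max(quorum, key=lambda rep: rep.version)
        return latest.version, latest.value

    def write(self, value):
        # Any W replicas overlap the previous write quorum, so the
        # highest version seen here is the current version.
        quorum = random.sample(self.replicas, self.w)
        new_version = max(rep.version for rep in quorum) + 1
        for rep in quorum:
            rep.version = new_version
            rep.value = value
        return new_version


store = QuorumStore(n=5, read_quorum=3, write_quorum=3)
store.write("a")
store.write("b")
version, value = store.read()
```

Whichever replicas the random quorums pick, the read returns version 2 with value "b", because the overlap conditions force each quorum to contain a copy touched by the previous write.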
Gossip Update
• When updates are less frequent than reads, updates can be propagated
lazily to replicas.
• Both read and update operations are directed by the FSA to any RM
• The FSA shields replication details from clients.
• Increased performance
• Typically read-one, write-gossip
• Uses timestamps
Basic Gossip Update
• Read: if TSfsa <= TSrm, the RM has recent data and returns
it; otherwise wait for gossip or try another RM
• Update: if TSfsa > TSrm, apply the update, advance TSrm, and send
gossip; otherwise the handling depends on the application:
perform the update or reject it
• Gossip: update the RM if the gossip carries new updates.
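The three rules above can be sketched with scalar timestamps. This is a minimal sketch under stated assumptions: `GossipRM` is a hypothetical class, each RM holds a single value, a stale update is simply rejected (one of the two application choices), and gossip is an explicit method call instead of a background exchange.

```python
class GossipRM:
    """Hypothetical replica manager using a scalar timestamp (TSrm)."""

    def __init__(self):
        self.ts = 0
        self.value = None

    def read(self, ts_fsa):
        # Read rule: serve the data only if this RM is at least as
        # recent as what the FSA has already seen.
        if ts_fsa <= self.ts:
            return self.value, self.ts
        return None  # caller waits for gossip or tries another RM

    def update(self, ts_fsa, value):
        # Update rule: accept only updates newer than the local state.
        if ts_fsa > self.ts:
            self.ts, self.value = ts_fsa, value
            return True
        return False  # assumption: this application rejects stale updates

    def gossip_to(self, other):
        # Gossip rule: the receiver applies the message only if it is newer.
        if self.ts > other.ts:
            other.ts, other.value = self.ts, self.value


rm_a, rm_b = GossipRM(), GossipRM()
accepted = rm_a.update(1, "v1")
stale_read = rm_b.read(1)      # rm_b has not heard the gossip yet
rm_a.gossip_to(rm_b)
fresh_read = rm_b.read(1)
rejected = rm_b.update(1, "old")
```

Before the gossip arrives, `rm_b` cannot answer a read from an FSA that has already seen timestamp 1; after the gossip it can.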
Causal Order Gossip Protocol
• Used for read-modify updates
• Assumes a fixed RM configuration
• Uses vector timestamps
• Uses a buffer to keep updates in order
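One common way to combine vector timestamps with a buffer is the causal delivery rule sketched below. It is an illustration of the general technique and not necessarily the exact protocol from the textbook. `CausalRM` is a hypothetical class, and the number of RMs `n` is fixed, matching the fixed-configuration assumption. An update from sender `s` with vector timestamp `vts` is applied only when it is the next update from `s` and everything it depends on has already been applied; otherwise it stays in the buffer.

```python
class CausalRM:
    """Hypothetical RM that applies updates in causal order."""

    def __init__(self, n):
        self.clock = [0] * n   # updates applied so far, per originating RM
        self.buffer = []       # updates waiting for their dependencies
        self.log = []          # updates in the order they were applied

    def _deliverable(self, sender, vts):
        for k, count in enumerate(vts):
            if k == sender:
                # Must be the next update from this sender.
                if count != self.clock[k] + 1:
                    return False
            elif count > self.clock[k]:
                # Depends on an update we have not applied yet.
                return False
        return True

    def receive(self, sender, vts, update):
        self.buffer.append((sender, vts, update))
        progress = True
        while progress:  # applying one update may unblock others
            progress = False
            for msg in list(self.buffer):
                s, v, u = msg
                if self._deliverable(s, v):
                    self.log.append(u)
                    self.clock[s] += 1
                    self.buffer.remove(msg)
                    progress = True


rm = CausalRM(3)
# u2 was issued by RM 1 after it had seen u1 from RM 0, but arrives first.
rm.receive(1, [1, 1, 0], "u2")
buffered = len(rm.buffer)
rm.receive(0, [1, 0, 0], "u1")
```

Here `u2` waits in the buffer until `u1` arrives, so the log ends up in causal order even though the messages arrived out of order.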
Disadvantages of File replication
• The contents of the file need to be known before the
replication operation takes place.
• Existing systems cannot work in limited-bandwidth
networks.
• DFS replication will not work well when there is a
large number of changes to replicate.
Current Project
• Data Grid File Replication [2] [C. Yang, 2008]
• Create copies in convenient locations
• Replicas are adjusted to appropriate locations
using Bayesian Networks (BN)
• File replication in P2P systems
• Plover: makes replicas among physically close
nodes; balances load between replica nodes [3] [H.
Shen, 2009]
• EAD: efficient and adaptive decentralized file
replication algorithm [4,5] [H. Shen, 2009]
Future Work
• Improve the efficiency and effectiveness of file
replication schemes
• Integrate file replication and consistency
maintenance
References
[1] R. Chow and T. Johnson, Distributed Operating Systems &
Algorithms, 1997
[2] C. Yang, C. Huang, and T. Hsiao, A Data Grid File Replication
Maintenance Strategy Using Bayesian Networks, Eighth International
Conference on Intelligent Systems Design and Applications, 2008
[3] H. Shen and Y. Zhu, A proactive low-overhead file replication
scheme for structured P2P content delivery network, Journal of Parallel
and Distributed Computing, 2009
[4] H. Shen, IRM: Integrated File Replication and Consistency
Maintenance in P2P Systems, IEEE Transactions on Parallel and
Distributed Systems, 2009
[5] H. Shen, An Efficient and Adaptive Decentralized File Replication
Algorithm in P2P File Sharing Systems, IEEE Transactions on
Parallel and Distributed Systems, 2009
