Lecture for Chapter 6.4 (Fall 09)
6.4 Data And File Replication
Presenter : Jing He
Instructor: Dr. Yanqing Zhang
Outline
• Basic Knowledge
• Most Recent Projects
• Future Works
• References
Why replicate
• Performance
• Reliability
• Resource sharing
• Network resource saving
Challenge
• Transparency
– Parallelism transparency
– Failure transparency
– Replication transparency
• Concurrency Control
• Failure Recovery
Goal
• One-copy serializability:
– The execution of transactions on replicated objects is equivalent to the execution of the same transactions on non-replicated objects [1][R. Chow et al. 1997].
Architecture
• FSA: file service agent, the client interface
• RM: replica manager, provides the replication functions
• A client chooses one or more FSAs to access a data object.
• The FSA acts as a front end to the replica managers (RMs) to provide replication transparency.
• The FSA contacts one or more RMs to actually read or update data objects.
Architecture
[Figure: two clients, two FSAs, and four RMs; clients reach the RMs through the FSAs]
Read operations
• Read-one-primary: the FSA reads only from the primary RM to enforce consistency
• Read-one: the FSA may read from any RM to gain concurrency
• Read-quorum: the FSA must read from a quorum of RMs to determine which copy is current
Write Operations
• Write-one-primary: write only to the primary RM; the primary RM updates all other RMs
• Write-all: update all RMs
• Write-all-available: write to all functioning RMs. A faulty RM must be synchronized before it is brought back online.
Write Operations Cont.
• Write-quorum: update a predefined quorum of RMs
• Write-gossip: update any RM; the update is lazily propagated to the other RMs
Read-one-primary,
write-one-primary
• Other RMs are backups of the primary RM
• No concurrency
• Easy to serialize
• Simple to implement
• Achieves one-copy serializability
• The primary RM is a performance bottleneck
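A minimal sketch of the read-one-primary, write-one-primary scheme described above. The class names `ReplicaManager` and `PrimaryCopyFSA` and the in-memory dictionary store are assumptions made for illustration; they do not come from the lecture.

```python
class ReplicaManager:
    """Hypothetical RM holding file contents in a dictionary."""

    def __init__(self, name):
        self.name = name
        self.store = {}


class PrimaryCopyFSA:
    """Hypothetical FSA that routes every read and write to the primary RM."""

    def __init__(self, primary, backups):
        self.primary = primary
        self.backups = backups

    def read(self, key):
        # Read-one-primary: only the primary serves reads.
        return self.primary.store.get(key)

    def write(self, key, value):
        # Write-one-primary: the primary applies the write first ...
        self.primary.store[key] = value
        # ... and then brings every backup up to date.
        for rm in self.backups:
            rm.store[key] = value
```

Because every operation passes through one RM, operations are trivially serialized, which is also why the primary becomes the bottleneck.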
Read-one,
Write-all
• Provides concurrency
• A concurrency control protocol is needed to ensure consistency (serialization)
• Achieves one-copy serializability
• Difficult to implement (a single failed RM blocks all updates)
Read-one,
Write-all-available
• Variation of read-one, write-all
• May not guarantee one-copy serializability
• Transactions may conflict frequently
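The difference between write-all and write-all-available can be sketched as follows. The `RM` class, its `up` flag, and the `resync` helper are hypothetical names introduced for this example, and copying the whole store during resynchronization is a simplifying assumption.

```python
class RM:
    """Hypothetical replica manager with an availability flag."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.store = {}


def write_all(rms, key, value):
    """Write-all: succeed only if every RM is up; otherwise write nothing."""
    if not all(rm.up for rm in rms):
        return False
    for rm in rms:
        rm.store[key] = value
    return True


def write_all_available(rms, key, value):
    """Write to every functioning RM; return the names of RMs that missed it."""
    missed = []
    for rm in rms:
        if rm.up:
            rm.store[key] = value
        else:
            missed.append(rm.name)
    return missed


def resync(stale, source):
    """Synchronize a recovered RM from an up-to-date one before it goes online."""
    stale.store = dict(source.store)
    stale.up = True
```

With one RM down, `write_all` refuses the update, while `write_all_available` proceeds and leaves the failed RM stale until `resync` is called.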
Read-quorum,
Write-quorum
• A version number is attached to each replicated object
• On a read, the copy with the highest version number is the latest
• A write operation advances the version number by 1
• To prevent write-write conflicts: 2 × write quorum > number of object copies
• To prevent read-write conflicts: write quorum + read quorum > number of object copies
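The two quorum conditions and the version-number rule can be illustrated with a short sketch. The function and class names are assumptions for this example, and the choice of which replicas form each quorum (the first `r` for reads, the last `w` for writes) is deliberately fixed so that the overlap is easy to see; a real system would contact whichever RMs respond.

```python
def valid_quorums(n, r, w):
    """Check both overlap conditions for n copies, read quorum r, write quorum w."""
    return 2 * w > n and r + w > n


class Replica:
    """Hypothetical replica holding one object and its version number."""

    def __init__(self):
        self.version = 0
        self.value = None


def quorum_read(replicas, r):
    """Read r replicas and return (version, value) of the highest version seen."""
    contacted = replicas[:r]  # assumption: the first r replicas answer
    newest = max(contacted, key=lambda rep: rep.version)
    return newest.version, newest.value


def quorum_write(replicas, w, value):
    """Write to w replicas, advancing the highest version seen by 1."""
    contacted = replicas[-w:]  # assumption: the last w replicas answer
    new_version = max(rep.version for rep in contacted) + 1
    for rep in contacted:
        rep.version = new_version
        rep.value = value
    return new_version
```

For example, with 5 copies, a read quorum of 2, and a write quorum of 4, both conditions hold, so any read quorum shares at least one replica with the most recent write quorum and therefore sees the latest version.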
Gossip Update
• When updates are less frequent than reads, updates can be propagated lazily to replicas.
• Both read and update operations are directed by the FSA to any RM
• The FSA shields replication details from clients.
• Increased performance
• Typically read-one, write-gossip
• Uses timestamps
Basic Gossip Update
• Read: if TSfsa <= TSrm, the RM has recent data and returns it; otherwise wait for gossip or try another RM
• Update: if TSfsa > TSrm, perform the update, advance TSrm, and send gossip. Otherwise, depending on the application, perform the update or reject it
• Gossip: update the RM if the gossip carries new updates.
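The three rules above can be sketched with scalar timestamps. The `GossipRM` class is a hypothetical name, and two choices here are assumptions rather than part of the lecture: a stale update is rejected (the slide leaves this to the application), and gossip carries the sender's whole value rather than a log of updates.

```python
class GossipRM:
    """Hypothetical RM that keeps one value and a scalar timestamp TSrm."""

    def __init__(self):
        self.ts = 0
        self.value = None

    def read(self, ts_fsa):
        """Return (value, TSrm) if TSfsa <= TSrm, else None."""
        if ts_fsa <= self.ts:
            return self.value, self.ts
        return None  # caller waits for gossip or tries another RM

    def update(self, ts_fsa, value):
        """Apply the update if TSfsa > TSrm; return True if it was applied."""
        if ts_fsa > self.ts:
            self.value = value
            self.ts = ts_fsa
            return True
        return False  # assumption: stale updates are rejected

    def gossip_to(self, other):
        """Send state to another RM, which adopts it only if it is newer."""
        if self.ts > other.ts:
            other.value = self.value
            other.ts = self.ts
```

An FSA that has seen timestamp 1 cannot read from an RM still at timestamp 0; once gossip arrives, the same read succeeds.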
Causal Order Gossip Protocol
• Used for read-modify updates
• Assumes a fixed RM configuration
• Uses vector timestamps
• Uses a buffer to preserve the order
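A sketch of how vector timestamps and a hold-back buffer can preserve causal order among a fixed set of RMs. This uses the standard causal delivery rule, which may differ in detail from the protocol covered in the lecture; the `CausalRM` class and its message format `(sender, timestamp, payload)` are assumptions for illustration.

```python
class CausalRM:
    """Hypothetical RM in a fixed group of n RMs, identified by index."""

    def __init__(self, index, n):
        self.index = index
        self.clock = [0] * n   # vector timestamp, one entry per RM
        self.buffer = []       # gossip messages that arrived too early
        self.log = []          # updates applied, in order

    def local_update(self, payload):
        """Apply a client update and return the gossip message to send."""
        self.clock[self.index] += 1
        self.log.append(payload)
        return (self.index, list(self.clock), payload)

    def _deliverable(self, sender, ts):
        # Next update from the sender, and nothing it depends on is missing.
        return ts[sender] == self.clock[sender] + 1 and all(
            ts[k] <= self.clock[k] for k in range(len(ts)) if k != sender
        )

    def receive(self, msg):
        """Buffer the message, then apply every message that is now in order."""
        self.buffer.append(msg)
        progressed = True
        while progressed:
            progressed = False
            for m in list(self.buffer):
                sender, ts, payload = m
                if self._deliverable(sender, ts):
                    self.clock[sender] = ts[sender]
                    self.log.append(payload)
                    self.buffer.remove(m)
                    progressed = True
```

If an update that causally depends on another arrives first, it stays in the buffer until the earlier update has been applied.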
Disadvantages of File Replication
• The contents of a file need to be known before the replication operation takes place.
• Existing systems cannot work in limited-bandwidth networks.
• DFS replication does not work well when there is a large number of changes to replicate.
Current Project
• Data grid file replication [2][C. Yang, 2008]
• Creates copies in convenient locations
• Replicas are adjusted to appropriate locations using Bayesian networks (BN)
• File replication in P2P systems
• Plover: creates replicas among physically close nodes and balances load between replica nodes [3][H. Shen, 2009]
• EAD: efficient and adaptive decentralized file replication algorithm [4,5][H. Shen, 2009]
Future Work
• Improve the efficiency and effectiveness of file replication schemes
• Integrate file replication and consistency maintenance
References
[1] R. Chow and T. Johnson, Distributed Operating Systems & Algorithms, 1997
[2] C. Yang, C. Huang, and T. Hsiao, A Data Grid File Replication Maintenance Strategy Using Bayesian Networks, Eighth International Conference on Intelligent Systems Design and Applications, 2008
[3] H. Shen and Y. Zhu, A proactive low-overhead file replication scheme for structured P2P content delivery network, Journal of Parallel and Distributed Computing, 2009
[4] H. Shen, IRM: Integrated File Replication and Consistency Maintenance in P2P Systems, IEEE Transactions on Parallel and Distributed Systems, 2009
[5] H. Shen, An Efficient and Adaptive Decentralized File Replication Algorithm in P2P File Sharing Systems, IEEE Transactions on Parallel and Distributed Systems, 2009