Chapter 7 – Consistency Protocols


DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN

Chapter 7 Consistency And Replication (CONSISTENCY PROTOCOLS)

Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 1

CONSISTENCY PROTOCOLS

• Continuous Consistency
  - Bounding Numerical Deviation
  - Bounding Staleness Deviations
  - Bounding Ordering Deviations
• Sequential Consistency
  - Primary-based Protocol
    - Remote-write Protocol
    - Local-write Protocol
  - Replicated-write Protocol
    - Active Replication
    - Majority Voting (Quorum-Based)
• Cache-Coherence Protocols
• Implementing Client-Centric Consistency

consistency protocol

• A consistency protocol describes an implementation of a specific consistency model.


Continuous Consistency

• Bounding Numerical Deviation: a solution for keeping the numerical deviation within bounds.
• We concentrate on writes to a single data item x.
• Each write W(x) has an associated weight that represents the numerical value by which x is updated, denoted as weight(W).
• For simplicity, we assume that weight(W) > 0.
• Each write W is initially submitted to one out of the N available replica servers, in which case that server becomes the write's origin, denoted as origin(W).
• Each server $S_i$ will keep track of a log $L_i$ of writes that it has performed on its own local copy of x.


Continuous consistency: Numerical deviation

• Consider a data item x and let weight(W) denote the numerical change in its value after a write operation W. Assume that weight(W) > 0.
• W is initially forwarded to one of the N replicas, denoted as origin(W).
• $TW[i,j]$ is the aggregated weight of the writes executed by server $S_i$ that originated from $S_j$:
  $TW[i,j] = \sum \{\, \mathrm{weight}(W) \mid \mathrm{origin}(W) = S_j \wedge W \in \mathrm{log}(S_i) \,\}$
• The goal is, for any time t, to let the current value $v_i$ at server $S_i$ deviate within bounds from the actual value $v(t)$ of x.
• This actual value is completely determined by all submitted writes.

Continuous consistency: Numerical deviation

Actual value $v(t)$ of x:

$v(t) = v(0) + \sum_{k=1}^{N} TW[k,k]$

Value $v_i$ of x at replica i:

$v_i = v(0) + \sum_{k=1}^{N} TW[i,k]$
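The two sums above can be checked on a small example. The following is a minimal sketch, not the book's code: the `Write` record, the function names, and the three sample logs are all made up for illustration, and servers are indexed from 0 rather than 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    origin: int    # index of the server the write was submitted to
    weight: float  # numerical change applied to x, assumed > 0


def tw(log, j):
    """TW[i,j]: total weight of the writes in one server's log that originated at j."""
    return sum(w.weight for w in log if w.origin == j)


def actual_value(logs, v0=0.0):
    """v(t) = v(0) + sum over k of TW[k,k]; every write is in its origin's log."""
    return v0 + sum(tw(logs[k], k) for k in range(len(logs)))


def local_value(logs, i, v0=0.0):
    """v_i = v(0) + sum over k of TW[i,k]; only writes that reached server i count."""
    return v0 + sum(tw(logs[i], k) for k in range(len(logs)))


# Three replicas; server 0 has not yet seen the write submitted to server 2.
w_a, w_b, w_c = Write(0, 2.0), Write(1, 1.0), Write(2, 5.0)
logs = [
    [w_a, w_b],       # log of server 0
    [w_a, w_b, w_c],  # log of server 1
    [w_c],            # log of server 2
]

deviation_at_0 = actual_value(logs) - local_value(logs, 0)
```

Here server 0 lags by exactly the weight of the one write it has not received, which is the quantity the bound on the next slide constrains.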

Continuous consistency: Numerical deviation

• For every server $S_i$, we associate an upper bound $\delta_i$ such that we need to enforce $v(t) - v_i \le \delta_i$.
• When a server $S_i$ propagates a write originating from $S_j$ to $S_k$, the latter will be able to learn about the value $TW[i,j]$ at the time the write was sent.
• $S_k$ maintains a view $TW_k[i,j]$ of what it believes $S_i$ will have as value for $TW[i,j]$.

Continuous consistency: Numerical deviation

Solution

$S_k$ sends operations from its log to $S_i$ when it sees that $TW_k[i,k]$ is getting too far from $TW[k,k]$, in particular, when

$TW[k,k] - TW_k[i,k] > \delta_i / (N-1)$


Continuous consistency: Numerical deviation

• When server $S_k$ notices that $S_i$ has not been keeping pace with the updates that have been submitted to $S_k$, it forwards writes from its log to $S_i$.
• This forwarding effectively advances the view $TW_k[i,k]$ that $S_k$ has of $TW[i,k]$, making the deviation $TW[i,k] - TW_k[i,k]$ smaller.
• In particular, $S_k$ advances its view on $TW[i,k]$ when an application submits a new write that would increase $TW[k,k] - TW_k[i,k]$ beyond $\delta_i / (N-1)$. Advancement always ensures that $v(t) - v_i \le \delta_i$.
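The push rule can be sketched as follows. This is an illustrative model only: the class `OriginServer`, its fields, and the choice to forward the whole outstanding lag in one step are assumptions made here, and the network is replaced by a list that records what would be sent.

```python
class OriginServer:
    """Hypothetical model of S_k tracking its view TW_k[i,k] of each peer S_i."""

    def __init__(self, k, n, delta):
        self.k = k
        self.n = n
        self.delta = delta      # delta[i]: deviation bound of server i
        self.own_total = 0.0    # TW[k,k]
        self.view = [0.0] * n   # view[i] stands for TW_k[i,k]
        self.forwarded = []     # (peer, weight pushed), in place of real messages

    def submit(self, weight):
        """Accept a local write, then push to every peer whose bound is exceeded."""
        if weight <= 0:
            raise ValueError("weight(W) is assumed to be positive")
        self.own_total += weight
        for i in range(self.n):
            if i == self.k:
                continue
            lag = self.own_total - self.view[i]
            if lag > self.delta[i] / (self.n - 1):
                self.forwarded.append((i, lag))  # forward the missing writes
                self.view[i] = self.own_total    # the view of TW[i,k] advances


# Server 0 of three; peer 2 has a much tighter bound than peer 1.
s = OriginServer(k=0, n=3, delta=[4.0, 4.0, 1.0])
s.submit(1.0)
s.submit(1.5)
```

With these numbers, peer 2 is updated on every write, while peer 1 is only updated once its lag passes 2.0.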


Bounding Staleness Deviations

• There are many ways to keep the staleness of replicas within specified bounds.
• One approach is to have each submitted write timestamped by its origin server.
• Let server $S_k$ keep a real-time vector clock $RVC_k$, where $RVC_k[i] = T(i)$ means that $S_k$ has seen all writes that have been submitted to $S_i$ up to time $T(i)$.
• $T(i)$ denotes the time local to $S_i$.


Bounding Staleness Deviations

• Assume the clocks of the replica servers are loosely synchronized.
• Whenever server $S_k$ notes that $T(k) - RVC_k[i]$ is about to exceed a specified limit, it simply starts pulling in writes that originated from $S_i$ with a timestamp later than $RVC_k[i]$.
• A replica server is responsible for keeping its copy of x up to date regarding writes that have been issued elsewhere.
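A pull-based refresh along these lines might look as follows. The class name `StalenessReplica`, the `(timestamp, value)` log format, and the step that sets `rvc[i]` to the current time after a pull (which assumes the peer returned everything it had) are illustrative assumptions, not part of the slides.

```python
class StalenessReplica:
    """Hypothetical replica S_k holding a real-time vector clock rvc (RVC_k)."""

    def __init__(self, k, n, limit):
        self.k = k
        self.rvc = [0.0] * n  # rvc[i]: all writes at S_i up to this time were seen
        self.limit = limit    # maximum tolerated staleness, in seconds
        self.applied = []     # writes pulled in so far

    def stale_peers(self, now):
        """Peers i for which T(k) - RVC_k[i] exceeds the limit."""
        return [i for i, t in enumerate(self.rvc)
                if i != self.k and now - t > self.limit]

    def refresh(self, now, peer_logs):
        """Pull writes with a timestamp later than rvc[i] from every stale peer."""
        for i in self.stale_peers(now):
            newer = [(ts, val) for ts, val in peer_logs[i] if ts > self.rvc[i]]
            self.applied.extend(newer)
            self.rvc[i] = now  # assumes loosely synchronized clocks


r = StalenessReplica(k=0, n=3, limit=10.0)
r.rvc = [0.0, 95.0, 80.0]
peer_logs = {
    1: [(96.0, "a")],
    2: [(70.0, "old"), (85.0, "b"), (99.0, "c")],
}
stale_before = r.stale_peers(100.0)
r.refresh(100.0, peer_logs)
```

Only peer 2 is contacted here, because peer 1 is still within the 10-second limit.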

Numerical VS Staleness Bounds

• Numerical bounds follow a push approach: an origin server keeps replicas up to date by forwarding writes.
• Staleness bounds follow a pull approach: replica servers pull in updates from origin servers.


Bounding Ordering Deviations

• Ordering deviations in continuous consistency are caused by the fact that a replica server tentatively applies updates that have been submitted to it.
• Each server will have a local queue of tentative writes for which the actual order in which they are to be applied to the local copy of x still needs to be determined.
• The ordering deviation is bounded by specifying the maximal length of the queue of tentative writes.
• Committing tentative writes requires enforcing a globally consistent ordering of them.
• This can be done using primary-based or quorum-based protocols.
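A bounded queue of tentative writes can be sketched as below. The class `TentativeQueue` is hypothetical, and refusing new writes once the queue is full is one possible reaction assumed here; the agreed global order is simply passed in, standing in for whatever primary-based or quorum-based protocol produces it.

```python
class TentativeQueue:
    """Hypothetical per-server queue of tentative writes with a length bound."""

    def __init__(self, max_len):
        self.max_len = max_len
        self.tentative = []  # applied locally, final order not yet known
        self.committed = []  # applied in the globally agreed order

    def submit(self, write):
        """Apply a write tentatively; refuse it if the bound would be exceeded."""
        if len(self.tentative) >= self.max_len:
            return False
        self.tentative.append(write)
        return True

    def commit(self, global_order):
        """Commit the tentative writes that appear in the agreed global order."""
        agreed = [w for w in global_order if w in self.tentative]
        self.committed.extend(agreed)
        self.tentative = [w for w in self.tentative if w not in agreed]


q = TentativeQueue(max_len=2)
accepted = [q.submit("w1"), q.submit("w2"), q.submit("w3")]
q.commit(["w2", "w1"])  # the agreed order differs from the local order
accepted_after_commit = q.submit("w3")
```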


Primary-Based Protocols

• Each data item x in the data store has an associated primary, which is responsible for coordinating write operations on x.
• Primary-based protocols provide a straightforward implementation of sequential consistency.
• All processes see all write operations in the same order, no matter which backup server they use to perform read operations.
• A distinction can be made as to whether:
  - the primary is fixed at a remote server
  - write operations can be carried out locally after moving the primary to the process where the write operation is initiated

Remote-Write Protocols

• All write operations need to be forwarded to a fixed single server.
• Read operations can be carried out locally.
• Also called primary-backup protocol (Fig. 7-20).
• Disadvantage: it may take a relatively long time before the process that initiated the update is allowed to continue, because an update is implemented as a blocking operation.
• Alternative: nonblocking approach.

Remote-Write Protocols

Figure 7-20. The principle of a primary-backup protocol.


Remote-Write Protocols

• Nonblocking approach: as soon as the primary has updated its local copy of x, it returns an acknowledgment. After that, it tells the backup servers to perform the update as well.
• Advantage: write operations may speed up considerably.
• Disadvantage: fault tolerance suffers, because an acknowledged update may not yet be backed up by other servers.
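The difference between the blocking and nonblocking variants can be sketched in a few lines. The classes `Backup` and `Primary`, the `flush` method, and the in-memory dictionaries are illustrative assumptions; real protocols exchange messages and wait for backup acknowledgments.

```python
class Backup:
    """Hypothetical backup server holding a copy of the data store."""

    def __init__(self):
        self.store = {}

    def apply(self, key, value):
        self.store[key] = value


class Primary:
    """Hypothetical primary that coordinates all writes."""

    def __init__(self, backups, blocking=True):
        self.store = {}
        self.backups = backups
        self.blocking = blocking
        self.pending = []  # acknowledged updates not yet sent to the backups

    def write(self, key, value):
        self.store[key] = value
        if self.blocking:
            for b in self.backups:  # update every backup before acknowledging
                b.apply(key, value)
        else:
            self.pending.append((key, value))  # acknowledge first, propagate later
        return "ack"

    def flush(self):
        """Propagate updates that were acknowledged in nonblocking mode."""
        for key, value in self.pending:
            for b in self.backups:
                b.apply(key, value)
        self.pending.clear()


b_block, b_nonblock = Backup(), Backup()
Primary([b_block], blocking=True).write("x", 1)
p = Primary([b_nonblock], blocking=False)
p.write("x", 1)
store_before_flush = dict(b_nonblock.store)  # still empty: the fault-tolerance risk
p.flush()
```

If the nonblocking primary crashed before `flush`, the acknowledged write would exist nowhere else, which is the disadvantage listed above.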

Local-Write Protocols

• When a process wants to update data item x, it locates the primary copy of x and subsequently moves it to its own location (Fig. 7-21).
• Advantage (in the nonblocking protocol only): multiple, successive write operations can be carried out locally, while reading processes can still access their local copy.
• Updates are propagated to the replicas after the primary has finished performing the updates locally.

Local-Write Protocols

Figure 7-21. Primary-backup protocol in which the primary migrates to the process wanting to perform an update.

Local-Write Protocols

Example primary-backup protocol with local writes:
• Mobile computing in disconnected mode (ship all relevant files to the user before disconnecting, and update later on).
• While disconnected, all update operations are carried out locally, while other processes can still perform read operations (but no updates).
• Later, when connecting again, updates are propagated from the primary to the backups, bringing the data store into a consistent state again.

Local-Write Protocols

Example primary-backup protocol with local writes:
• Distributed file systems that require a high degree of fault tolerance.
• There is a fixed central server through which all write operations take place. The server temporarily allows one of the replicas to perform a series of local updates to speed up performance.
• When the replica server is done, the updates are propagated to the central server, from where they are then distributed to the other replica servers.


Replicated-Write Protocols

• Write operations can be carried out at multiple replicas instead of only one.
• A distinction can be made between:
  - active replication, in which an operation is forwarded to all replicas
  - majority voting (quorum-based protocols)


Active Replication

• Each replica has an associated process that carries out update operations.
• Operations need to be carried out in the same order everywhere.
• This requires a totally-ordered multicast mechanism. Such a multicast can be implemented using Lamport's logical clocks.
• Disadvantage: this implementation of multicasting does not scale well in large distributed systems.
• Alternatively, total ordering can be achieved using a central coordinator, also called a sequencer.

Active Replication

• First forward each operation to the sequencer, which assigns it a unique sequence number and subsequently forwards the operation to all replicas.
• Operations are carried out in the order of their sequence number.
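A sequencer-based ordering can be sketched as follows. The classes `Sequencer` and `SeqReplica` are hypothetical, operations are modeled as plain functions on a single integer value, and delivery is a direct method call rather than a network message.

```python
import itertools


class SeqReplica:
    """Hypothetical replica that applies operations strictly in sequence order."""

    def __init__(self):
        self.value = 0
        self.next_seq = 1
        self.held = {}  # operations that arrived before their turn

    def deliver(self, seq, op):
        self.held[seq] = op
        # Apply every operation whose turn has come, holding back the rest.
        while self.next_seq in self.held:
            self.value = self.held.pop(self.next_seq)(self.value)
            self.next_seq += 1


class Sequencer:
    """Hypothetical central coordinator that hands out unique sequence numbers."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.counter = itertools.count(1)

    def submit(self, op):
        seq = next(self.counter)
        for r in self.replicas:
            r.deliver(seq, op)
        return seq


r1, r2 = SeqReplica(), SeqReplica()
sequencer = Sequencer([r1, r2])
sequencer.submit(lambda v: v + 5)
sequencer.submit(lambda v: v * 2)

# A replica that receives the same operations out of order reaches the same state.
r3 = SeqReplica()
r3.deliver(2, lambda v: v * 2)
value_while_waiting = r3.value
r3.deliver(1, lambda v: v + 5)
```

Because addition and multiplication do not commute, the replicas agree only if they all apply the operations in sequence-number order, which the hold-back buffer enforces.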


Quorum-Based Protocols

• A different approach to supporting replicated writes is to use voting.
• The basic idea is to require clients to request and acquire the permission of multiple servers before either reading or writing a replicated data item.

Quorum-Based Protocols

How does the algorithm work?

• A file is replicated on N servers.
• To update a file, a client must first contact at least half the servers plus one (a majority) and get them to agree to do the update.
• Once they have agreed, the file is changed and a new version number is associated with the new file.

Quorum-Based Protocols

How does the algorithm work?

• To read a replicated file, a client must also contact at least half the servers plus one and ask them to send the version numbers associated with the file.
• If all the version numbers are the same, this must be the most recent version, since the remaining servers are too few to form a write majority on their own.

Quorum-Based Protocols

• To read a file of which N replicas exist, a client needs to assemble a read quorum, an arbitrary collection of any $N_R$ servers, or more.
• To modify a file, a write quorum of at least $N_W$ servers is required.
• The values of $N_R$ and $N_W$ are subject to the following two constraints:
  1. $N_R + N_W > N$
  2. $N_W > N/2$
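The two constraints, and why they let a reader always find the latest version, can be sketched as below. The function names, the `(version, value)` tuples, and the quorum sizes are illustrative assumptions; the read deliberately contacts the first servers and the write the last ones, to show that the quorums still overlap.

```python
def valid_quorum(n, n_r, n_w):
    """True if the quorum sizes satisfy N_R + N_W > N and N_W > N/2."""
    return n_r + n_w > n and n_w > n / 2


def quorum_write(replicas, n_w, value):
    """Install a new version on the last n_w replicas (any n_w would do)."""
    contacted = range(len(replicas) - n_w, len(replicas))
    new_version = max(replicas[i][0] for i in contacted) + 1
    for i in contacted:
        replicas[i] = (new_version, value)
    return new_version


def quorum_read(replicas, n_r):
    """Contact the first n_r replicas and return the highest-versioned copy."""
    return max(replicas[:n_r], key=lambda r: r[0])


# Sizes chosen here for illustration, with N = 12.
checks = [
    valid_quorum(12, 3, 10),  # both constraints hold
    valid_quorum(12, 7, 6),   # N_W is not more than half: write-write conflicts
    valid_quorum(12, 1, 12),  # read one, write all
]

replicas = [(0, "init")] * 5
quorum_write(replicas, n_w=4, value="a")
latest = quorum_read(replicas, n_r=2)
```

With N = 5, a write quorum of 4 and a read quorum of 2 must share at least one server, so the read sees version 1 even though one of the two servers it contacted is stale.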

Quorum-Based Protocols

Figure 7-22. Three examples of the voting algorithm. (a) A correct choice of read and write set. (b) A choice that may lead to write-write conflicts. (c) A correct choice, known as ROWA (read one, write all).


Cache-Coherence Protocols

• Cache-coherence protocols ensure that a cache is consistent with the server-initiated replicas.
• Solutions differ in two respects:
  - coherence detection strategy (when inconsistencies are detected)
  - coherence enforcement strategy (how caches are kept consistent)


Cache-Coherence Protocols

• Caches are controlled by clients instead of servers.
• Much research has gone into the design and implementation of caches.
• Caching solutions differ in their coherence detection strategy, that is, when inconsistencies are detected.
• Dynamic solutions: inconsistencies are detected at runtime. For example, a check is made with the server to see whether the cached data have been modified since they were cached.
• In the case of distributed databases, protocols based on dynamic detection can be further classified by considering exactly when during a transaction the detection is done.

Implementing Client-Centric Consistency

• A naive implementation: each write operation W is assigned a globally unique identifier.
• Such an identifier is assigned by the server to which the write had been submitted (the origin of W).
• For each client, we keep track of two sets of writes:
  - The read set for a client consists of the writes relevant for the read operations performed by the client.
  - The write set consists of the identifiers of the writes performed by the client.

Implementing Client-Centric Consistency

Monotonic-read consistency:
• When a client performs a read operation at a server, that server is handed the client's read set to check whether all the identified writes have taken place locally. (The size of the set may introduce a performance problem.)
• If not, it contacts the other servers to ensure that it is brought up to date before carrying out the read operation.
• Alternatively, the read operation is forwarded to a server where the write operations have already taken place.
• After the read operation is performed, the write operations that have taken place at the selected server and which are relevant for the read operation are added to the client's read set.

Implementing Client-Centric Consistency

Monotonic-write consistency:
• When a client initiates a new write operation at a server, the server is handed the client's write set.
• It then ensures that the identified write operations are performed first and in the correct order.
• After performing the new operation, that operation's write identifier is added to the write set.

Implementing Client-Centric Consistency

Read-your-writes consistency:
• Requires that the server where the read operation is performed has seen all the write operations in the client's write set.
• The writes can be fetched from other servers before the read operation is performed.

Implementing Client-Centric Consistency

Writes-follow-reads consistency:
• Implemented by first bringing the selected server up to date with the write operations in the client's read set.
• Afterwards, the identifier of the write operation is added to the write set, along with the identifiers in the read set.
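The naive read-set and write-set scheme can be sketched as one small model. Everything named here (`Server`, `Client`, `catch_up`, the tuple-shaped identifiers) is hypothetical, the sketch enforces all four guarantees at once for brevity, it treats every write applied at a server as relevant to a read, and fetching a missing write from another server is reduced to recording its identifier.

```python
import itertools

_ids = itertools.count(1)


class Server:
    """Hypothetical replica that records the identifiers of applied writes."""

    def __init__(self, name):
        self.name = name
        self.applied = []  # write identifiers in the order applied locally

    def catch_up(self, write_ids):
        """Apply, in order, any identified write this server has not seen yet."""
        for wid in write_ids:
            if wid not in self.applied:
                self.applied.append(wid)  # stands in for fetching from a peer


class Client:
    def __init__(self):
        self.read_set = []   # ids of writes relevant to this client's reads
        self.write_set = []  # ids of writes performed by this client

    def read(self, server):
        # Monotonic reads and read-your-writes: the server must hold both sets.
        server.catch_up(self.read_set)
        server.catch_up(self.write_set)
        for wid in server.applied:
            if wid not in self.read_set:
                self.read_set.append(wid)
        return list(server.applied)

    def write(self, server):
        # Monotonic writes and writes-follow-reads: earlier writes go first.
        server.catch_up(self.read_set)
        server.catch_up(self.write_set)
        wid = (server.name, next(_ids))  # unique id assigned by the origin
        server.applied.append(wid)
        for rid in self.read_set:
            if rid not in self.write_set:
                self.write_set.append(rid)
        self.write_set.append(wid)
        return wid


s1, s2 = Server("s1"), Server("s2")
c = Client()
w1 = c.write(s1)
seen_at_s2 = c.read(s2)  # s2 is brought up to date before the read
w2 = c.write(s2)
```

The read at `s2` returns the client's own earlier write even though it was submitted to `s1`, and the second write is applied at `s2` only after the first.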

Improving Efficiency

• The read set and write set associated with each client can become very large.
• Solution: a client's read and write operations are grouped into sessions.
• A session is typically associated with an application: it is opened when the application starts and is closed when it exits.
• Whenever a client closes a session, the sets are cleared.


Summary

• Consistency protocols describe specific implementations of consistency models.
• A distinction can be made between primary-based protocols and replicated-write protocols.
• In primary-based protocols, all update operations are forwarded to a primary copy that subsequently ensures the update is properly ordered and forwarded.
• In replicated-write protocols, an update is forwarded to several replicas at the same time.