Abstractions for Fault Tolerant Computing

Lecture 2
Introduction to Principles of
Distributed Computing
Sergio Rajsbaum
Math Institute
UNAM, Mexico
Sergio Rajsbaum 2006
Lecture 2
• Part I: Review of Lecture 1. What a
distributed system is, and its parameters.
Problems solved in such a system. The need
for a theoretical foundation. Two-phase
commit
• Part II: Coordinated attack, consensus
Part I: What is a distributed system
The need for a theoretical foundation.
Two-phase commit
Principles of Distributed Computing
• Distributed computing studies systems where
components interact and collaborate
• Principles of distributed computing tries to
understand the fundamental possibilities and
limitations of such systems, with a precise,
scientific approach
• Goal: to design efficient and reliable systems, and
techniques to design them, analyze them and prove
them correct, or to prove impossibility results when
no protocol exists
What is distributed computing?
• Any system where several independent
computing components interact
• This broad definition encompasses
– VLSI chips, and any modern PC
– tightly-coupled shared memory multiprocessors
– local area clusters of workstations
– the Internet, the Web, Web services
– wireless networks, sensor networks, ad-hoc networks
– cooperating robots, mobile agents, P2P systems
Computing components
• Referred to as processors or processes in the
literature
• Can represent a
– microprocessor
– process in a multiprocessing operating system
– Java thread
– mobile agent, mobile node (e.g. laptop), robot
– computing element in a VLSI chip
Interaction – message passing vs.
shared memory
• Processors need to communicate with each other to
collaborate, via
• Message passing
– Point-to-point channels, defining an interconnection
graph
– All-to-all using an underlying infrastructure (e.g.
TCP/IP)
– Broadcast; wireless, satellite
• Shared memory
– Shared-objects: read/write, test&set, compare&swap, etc
– Usually harder to implement, easier to program
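The shared objects named above can be sketched in a few lines. This is only an illustration of their semantics, assuming a lock stands in for hardware atomicity; the class name SharedRegister and the helper count_winners are hypothetical.

```python
import threading


class SharedRegister:
    """A shared object whose operations are atomic (modeled with a lock)."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def write(self, value):
        with self._lock:
            self._value = value

    def test_and_set(self):
        """Atomically set the value to 1 and return the previous value."""
        with self._lock:
            old = self._value
            self._value = 1
            return old

    def compare_and_swap(self, expected, new):
        """Atomically store `new` if the value equals `expected`.

        Returns True when the swap happened, False otherwise.
        """
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


def count_winners(num_threads=8):
    """Let several threads race on test&set; count how many saw 0."""
    flag = SharedRegister(0)
    results = []
    results_lock = threading.Lock()

    def worker():
        old = flag.test_and_set()
        with results_lock:
            results.append(old)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results.count(0)
```

Exactly one racing thread sees the old value 0, which is why test&set is enough to elect a single winner among competing processes.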
A distributed system
[Figure: processors collaborate through communication media]
Failures
• Any system that includes many components
running over a long period of time must
consider the possibility of failures
• of processors and communication media
• of different severity
– from processor crashes or message losses, to
– malicious Byzantine behavior
Many kinds of problems
• Clock synchronization
• Routing
• Broadcasting
• Naming
• P2P, how to share and find resources
• Sharing resources, mutual exclusion
• Increasing fault-tolerance, failure detection
• Security, authentication, cryptography
• Database transactions, atomic commitment
• Backups, reliable storage, file systems
• Applications, airline reservation, banking, electronic
commerce, publish/subscribe systems, web search, web
caching, …
Multi-layered, complex interactions
An example
• A fault-tolerant broadcast service is useful to build
a higher level database transaction module
• Naming and authentication are required
• And may work more efficiently if clocks are tightly
synchronized
• And good routing schemes should exist
• If the clock synchronization is attacked, the whole
system may be compromised
Chaos
We need a good foundation,
principles of distributed computing
Chaos
• Too many models, problems and orthogonal,
interacting issues
• Very hard to get things right, to reproduce
operating scenarios
• Sometimes it is easy to adapt a solution to a
different model, sometimes a small change in
the model makes a problem unsolvable
Distributed computing theory
• Models
– Good models [Schneider Ch.2 in Distributed Systems, Mullender (Ed.)]
– Relation between models: solve a problem only once; solve it in the
strongest possible model
• Problems
– Search for paradigms that represent fundamental distributed
computing issues
– Relations between problems: hierarchies of solvable and unsolvable
problems; reductions
• Solutions
– Design algorithms, verification techniques, programming
abstractions
– Impossibility results and lower bounds
• Efficiency measures
– Time, communication, failures, recovery time, bottlenecks,
congestion
Distributed Commit
An example of a distributed protocol
Fundamental part of distributed
DBMS
Distributed Commit
• A distributed transaction with components at
several sites should execute atomically
• Example: A manager of a chain of stores wants to
query all the stores, find the inventory of
toothbrushes at each, and issue instructions to move
toothbrushes from store to store in order to balance
the inventory.
• The operation is done by a single global transaction
T that has a component Ti at the i-th store and a
component T0 at the office where the manager is
located.
Sequence of activities performed by T
1. Component T0 is created at the site of the manager
2. T0 sends messages to all the stores instructing them to
create components Ti
3. Each Ti executes a query at store i to discover the number
of toothbrushes in inventory and reports this number to T0
4. T0 takes these numbers and determines, by some
algorithm we shall not discuss, what shipments of
toothbrushes are desired. T0 then sends messages such as
“store 10 should ship 500 toothbrushes to store 7” to the
appropriate stores
5. Stores receiving instructions update their inventory and
perform the shipments
Atomicity
• Ensure that this never happens: some of the actions
of T are executed, but others are not
• We do assume atomicity of each Ti, through
mechanisms such as logging and recovery
• Failures make the atomicity of T difficult to
achieve
– A site fails or is disconnected from the network
– A bug in the algorithm to redistribute toothbrushes
instructs store 10 to ship more than it has
Example of failures
• Suppose T10 replies to T0’s first message with
its inventory.
• The machine at store 10 then crashes, so the
instructions from T0 are never received by
T10
• However, T7 sees no problem, and receives
the instructions from T0
• Can distributed transaction T ever commit?
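The failure scenario above can be replayed with a small simulation. This is a minimal sketch, assuming no commit protocol at all; the function run_transfer and the inventory figures are hypothetical, and only the instruction “store 10 should ship 500 toothbrushes to store 7” comes from the example.

```python
def run_transfer(inventory, instructions, crashed):
    """Apply T0's shipment instructions without any commit protocol.

    inventory:    dict store -> number of toothbrushes
    instructions: list of (source, destination, amount)
    crashed:      set of stores that never receive their instruction
    Returns the inventory after every reachable store has acted.
    """
    result = dict(inventory)
    for source, destination, amount in instructions:
        if source not in crashed:
            result[source] -= amount       # the sender ships
        if destination not in crashed:
            result[destination] += amount  # the receiver records the arrival
    return result


# Hypothetical starting inventory for stores 7 and 10.
before = {7: 100, 10: 900}
plan = [(10, 7, 500)]  # "store 10 should ship 500 toothbrushes to store 7"

no_failure = run_transfer(before, plan, crashed=set())
with_crash = run_transfer(before, plan, crashed={10})
```

Without a failure the total number of toothbrushes is preserved. With store 10 crashed, store 7 acts on its instruction but store 10 does not, so only part of T is executed, which is exactly the outcome atomicity is meant to rule out.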
Agreement Paradigms
Coordinated attack
Consensus
Coordinated Attack
An important abstraction
• A pair of allied generals, A and B, have agreed to
attack simultaneously or not at all.
• They can communicate only via carrier pigeon;
message loss is possible
[Figure: generals A and B]
Difficulty: uncertainty
• Suppose general A sends B the message
“attack at dawn”
• General A won’t attack alone. A doesn’t know
whether B has received the message. B
understands A’s predicament, so B sends an
acknowledgment “agreed”
Impossible
Theorem: Assume that communication is
unreliable. Any protocol that guarantees that if
one of the generals attacks, then the other does
so at the same time, is a protocol in which
necessarily neither general attacks.
[Figure: A sends “attack at dawn” to B and wonders “Did B get it?”; B replies “ack” and wonders “Did A get it?”]
It never ends
• There is always uncertainty about whether the last message was
delivered or not
• Corollary: If a decision must be made within a fixed time
period, then unreliable communication makes
database commitment protocols impossible
[Figure: A sends “ack your ack” and wonders “Did B get it?”; B replies “ack your ack to my ack” and wonders “Did A get it?”]
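The uncertainty about the last message can be checked mechanically. This is a sketch under stated assumptions: A and B alternate messages (A sends the odd-numbered ones), the first lost message ends the exchange, and a general's view is just the list of messages it has received. The function names are hypothetical.

```python
def view(general, delivered):
    """Messages received by `general` when exactly the first `delivered`
    messages arrive and every later message is lost.

    Messages are numbered 1, 2, 3, ...; A sends the odd-numbered ones
    and B sends the even-numbered ones, so A receives the even ones.
    """
    if general == "A":
        return [m for m in range(1, delivered + 1) if m % 2 == 0]
    return [m for m in range(1, delivered + 1) if m % 2 == 1]


def sender_of(message):
    """A sends odd-numbered messages, B sends even-numbered ones."""
    return "A" if message % 2 == 1 else "B"


def last_sender_cannot_tell(k):
    """True if the sender of message k has the same view whether or not
    message k was delivered."""
    sender = sender_of(k)
    return view(sender, k) == view(sender, k - 1)
```

For every k the sender of message k sees the same thing in the run where k messages arrive and in the run where only k - 1 arrive, so it must act the same in both. Chaining these pairs of runs leads back to the run where nothing arrives, which is the intuition behind the theorem.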
Agreement Problems in Distributed
Computing are common
Because processes have different
views of the system’s state and history
Agreement Problems in Distributed
Computing are common…
Because processes have different views of the system’s state
and history, due to:
• Delays
• Failures
NASA plunged the Galileo spacecraft into Jupiter’s
turbulent atmosphere today. The unmanned spacecraft
dived into the atmosphere at 2:57 p.m. Eastern time. The
last of Galileo’s data arrived on Earth today after the
spacecraft was destroyed, taking 52 minutes to cross half
a billion miles of space
The New York Times, 21 Sept. 2003
… and Agreement Problems are
Important
• In a replicated data system: to execute the same
sequence of operations on the replicated data
• In a replicated sensor system: to agree on the values
of the sensors
• In a timed system: to synchronize a set of clocks
• In a broadcast system: to deliver the same messages
in the same order
• In a database system: to commit or abort a
transaction
Etc.
Consensus
The king of agreement problems
CONSENSUS
A fundamental Abstraction
Each process has an input and should decide an output such that:
Agreement: correct processes’ decisions are the same
Validity: the decision is the input of some process
Termination: eventually all correct processes decide
There are at least two possible input values 0 and 1
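The three requirements can be written as a check on one finished execution. This is a minimal sketch, not part of the lecture: the function check_consensus and its argument layout are hypothetical, and termination, being a property of the whole future, is only approximated by asking whether every correct process has decided by the end of the recorded run.

```python
def check_consensus(inputs, decisions, correct):
    """Check one finished execution against the three requirements.

    inputs:    dict process -> input value
    decisions: dict process -> decided value (absent if it never decided)
    correct:   set of processes that did not crash
    Returns a dict with one boolean per requirement.
    """
    decided_by_correct = [decisions[p] for p in correct if p in decisions]
    return {
        # All correct processes that decided chose the same value.
        "agreement": len(set(decided_by_correct)) <= 1,
        # Every decision is the input of some process.
        "validity": all(v in inputs.values() for v in decisions.values()),
        # Every correct process has decided.
        "termination": all(p in decisions for p in correct),
    }


# Process 3 crashed before deciding; processes 1 and 2 both decided 1.
ok = check_consensus({1: 0, 2: 1, 3: 1}, {1: 1, 2: 1}, correct={1, 2})
# Two correct processes decided differently.
bad = check_consensus({1: 0, 2: 1}, {1: 0, 2: 1}, correct={1, 2})
```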
A Solution to Consensus
For a group of people sitting in a room
A Solution to Consensus
Each one raises a card with its input
[Figure: five people raise cards showing 0, 1, 2, 0, 0]
A Solution to Consensus
Follow a coordinator
[Figure: inputs 0, 1, 2, 0, 0; everyone adopts the coordinator’s value, 1]
A Solution to Consensus
Majority wins (breaking ties with the largest)
[Figure: inputs 0, 1, 2, 0, 0; everyone decides the majority value, 0]
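The “majority wins” rule for people in a room, where everyone sees every card, might be written as follows. This is a small sketch; the function name majority_wins is hypothetical.

```python
from collections import Counter


def majority_wins(cards):
    """Return the most frequent value, breaking ties with the largest."""
    counts = Counter(cards)
    best = max(counts.values())
    return max(value for value, count in counts.items() if count == best)


# The cards raised in the room: 0, 1, 2, 0, 0.
decision = majority_wins([0, 1, 2, 0, 0])
```

Because everyone in the room applies the same function to the same set of cards, everyone reaches the same decision.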
A Solution to Consensus
Failures are no problem (choose another
coordinator, or majority of non-failed)
[Figure: cards 0, 1, 2, 0; one participant has failed (%!#)]
A Solution to Consensus
… because this cannot happen!!
[Figure: the same room with the failed participant (%!#); values shown: 0, 1, 2, 1, 0]
Consensus in Distributed Systems
This can happen: delays
[Figure: one value, 1, is visible; the other values are still unknown (?)]
Consensus in Distributed Systems
and then there are different views
[Figure: processes with inputs 0, 1, 2 hold different views; the views shown include 1020 and 1020?, and one process is marked †]
Consensus in Distributed Systems
so we try to reconcile views: another round
[Figure: after another round the views shown include 1020, 1020? and 10201; one process is marked †]
Consensus in Distributed Systems
but we could have the same problem!!
[Figure: the views shown still differ (1020, 1020? and 10201); one process is marked †]
So, is consensus solvable?
If so, how long does it take to solve it?
• It depends on what exactly the model is
• But what is a realistic model?
• And what are the common scenarios within the
model? The nature of a distributed system is to
include complex combinations of failures and delays
Basic Model – asynchronous crash
failure model
• Message passing (another option would be a
shared memory model)
• Channels between every pair of processes
• Crash failures, with a bound t:
at most t < n failures out of n > 1 processes
• No message loss among correct processes
• Unbounded message delays, unpredictable
processor speeds
Distributed algorithms
(protocols)
• A set of algorithms, each one runs on a
different processor (or as a thread in the
same computer)
• The code includes instructions to
communicate with other processors:
– Send (M) to p
– Upon receiving a message from q do
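The two communication instructions can be mimicked with an in-memory stand-in for the network. This is a sketch under assumptions that are not part of the model: the classes Network and Process are hypothetical, and messages are delivered one at a time in sending order, whereas the asynchronous model allows arbitrary delays.

```python
from collections import deque


class Network:
    """In-memory stand-in for the communication medium (an assumption)."""

    def __init__(self):
        self.in_transit = deque()  # entries are (sender, receiver, message)
        self.processes = {}

    def add(self, process):
        self.processes[process.name] = process
        process.network = self

    def deliver_all(self):
        """Deliver messages one at a time until none is in transit."""
        while self.in_transit:
            sender, receiver, message = self.in_transit.popleft()
            self.processes[receiver].on_receive(message, sender)


class Process:
    def __init__(self, name):
        self.name = name
        self.network = None
        self.log = []

    def send(self, message, to):
        """Send (M) to p."""
        self.network.in_transit.append((self.name, to, message))

    def on_receive(self, message, sender):
        """Upon receiving a message from q do: log it, answer a ping."""
        self.log.append((sender, message))
        if message == "ping":
            self.send("pong", sender)


def demo():
    """p pings q; q answers; return both logs."""
    net = Network()
    p, q = Process("p"), Process("q")
    net.add(p)
    net.add(q)
    p.send("ping", to="q")
    net.deliver_all()
    return p.log, q.log
```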
A consensus protocol
1. val ← input
2. send val to all
3. wait until at least n - t messages have been
received
4. let V[j] be the val received from process j, or ‘-’ if none was received
5. return h(V) = largest value in V
- This same code is executed by every process
- Each one receives the value input from some
application
- h is a predefined function that all processors know
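One way to experiment with this protocol is to simulate a single execution. This is a sketch, assuming processes are numbered 0 to n - 1 and that the caller chooses, through the hypothetical parameter heard_from, which messages reach each process before it stops waiting; the functions decide and run are likewise hypothetical names.

```python
def decide(received, n, t, h=max):
    """Steps 3-5 for one process.

    received: dict sender -> val, the messages that arrived before the
              process stopped waiting (it needs at least n - t of them).
    Returns h applied to the values actually received.
    """
    if len(received) < n - t:
        raise ValueError("still waiting: fewer than n - t messages")
    # V[j] is the value from process j, or '-' if none was received.
    V = [received.get(j, "-") for j in range(n)]
    return h([v for v in V if v != "-"])


def run(inputs, t, heard_from):
    """Simulate one execution.

    inputs:     list of input values, one per process (steps 1-2).
    heard_from: dict process -> set of senders whose messages arrived
                in time; chosen by the caller to model delays.
    Returns dict process -> decision.
    """
    n = len(inputs)
    return {
        p: decide({j: inputs[j] for j in senders}, n, t)
        for p, senders in heard_from.items()
    }


# Three processes, t = 1, so each one waits for n - t = 2 messages.
same = run([0, 1, 1], t=1, heard_from={0: {0, 1}, 1: {1, 2}, 2: {0, 2}})
split = run([0, 0, 1], t=1, heard_from={0: {0, 1}, 2: {1, 2}})
```

Varying inputs and heard_from shows how the decisions depend on which n - t messages happen to arrive first.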
Is this protocol correct ?
• It depends on what the set C of possible
inputs is
• An input to the protocol is a vector I, where
I[j] contains the local input of the j-th
process
• The local input of pj is known only to pj
• And is taken from some universe of possible
values V not including ‘-’
• Let C be the set of possible input vectors to
the protocol
Exercise 1
1. Define a set C as large as possible for which the
protocol is correct
2. Prove that the protocol is correct for this C
3. Do you need to assume t < n / 2 ?
Here, correct means that for every I in C, in every execution with
input I where at most t processes crash, the
consensus requirements are satisfied:
Termination: eventually all correct processes decide
Agreement: correct processes’ decisions are the same
Validity: the decision is the input of some process
Exercise 2
The protocol uses h (V) = largest value in V
1. Define another such function h’
2. Repeat the previous exercise with respect
to your h’
Exercise 3
Consider the set C that includes every possible
input vector formed with values from V,
where | V | is at least 2
1. Is there a function h for which the protocol
is correct ?
If so, give one such h and prove the protocol is
correct, otherwise, give a brief intuitive
argument of why there is no such h
Bibliography
Theory of distributed computing textbooks
• Attiya, Welch, Distributed Computing,
Wiley-Interscience, 2nd ed., 2004
• Garg, Elements of Distributed Computing,
Wiley-IEEE, 2002
• Lynch, Distributed Algorithms, Morgan
Kaufmann, 1997
• Tel, Introduction to Distributed Algorithms,
Cambridge U., 2nd ed., 2001
Bibliography
others
• Distributed Algorithms and Systems
http://www.md.chalmers.se/~tsigas/DISAS/index.html
• Conferences: DISC, PODC,…
• Journals: Distributed Computing,…
– Special issue PODC 20th anniversary, Sept. 2003
• ACM SIGACT News Distributed Computing
Column. Also one in EATCS Bulletin