ZooKeeper
Wait-Free Coordination for Internet-Scale Systems
What is ZooKeeper





Service for coordinating distributed processes
Wait-free coordination
Enables high-performance server implementation
Can handle hundreds of thousands of transactions per second
Distributed system for implementing distributed systems!
What distributed processes entail





Large number of processes
Heterogeneous hardware
Inter-Process Communication
Asynchronous systems
Network delays
Some Examples

Search engines




Crawling
Indexing
Query Processing
Large-scale data processing



Map-reduce
Hadoop
Dryad
Why is it necessary

Distributed systems need




Configuration Maintenance
Distributed Synchronization
Group Membership
Because



Race Conditions
Deadlocks
Bugs
Introduction




ZooKeeper – Coordination service
Database of meta-data
Relieves distributed applications of their coordination responsibilities
How?
Elements of ZooKeeper





Replicated in-memory database
Hierarchical DHT
Coarse-grained lock service
Event queue server
Hierarchical Pub/Sub server
Guarantees of ZooKeeper

Serializability
Serializable reads: all reads from a client are processed in the order the client issued them
Linearizability
Linearizable writes: all writes from all clients are processed in a single global order
Data Model


File-system-like namespace in which data is read and written in full
Uses znodes




Data objects
Hierarchical ordering
Znodes are unlike files: they are not designed for general data storage
They do support storing small amounts of metadata
The API







create(path, data, flags)
delete(path, version)
exists(path, watch)
getData(path, watch)
setData(path, data, version)
getChildren(path, watch)
sync(path)
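
The calls listed above can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: `ToyZooKeeper` and `ZNodeError` are hypothetical names, not part of any ZooKeeper client library; the `flags` and `watch` arguments and `sync` are left out; and methods use Python naming. It shows how the `version` argument turns `setData` and `delete` into conditional updates (a version of -1 skips the check).

```python
class ZNodeError(Exception):
    """Raised when a toy operation fails (missing node, stale version, ...)."""


class ToyZooKeeper:
    """In-memory stand-in for the znode tree; NOT the real client library."""

    def __init__(self):
        # path -> [data, version]; the root znode always exists
        self.nodes = {"/": [b"", 0]}

    def _get(self, path):
        if path not in self.nodes:
            raise ZNodeError("no node: " + path)
        return self.nodes[path]

    def _check(self, node, version):
        # version == -1 means "do not check the version"
        if version != -1 and version != node[1]:
            raise ZNodeError("bad version")

    def create(self, path, data):
        parent = path.rsplit("/", 1)[0] or "/"
        if path in self.nodes:
            raise ZNodeError("node exists: " + path)
        if parent not in self.nodes:
            raise ZNodeError("no parent for: " + path)
        self.nodes[path] = [data, 0]
        return path

    def exists(self, path):
        return path in self.nodes

    def get_data(self, path):
        data, version = self._get(path)
        return data, version

    def set_data(self, path, data, version):
        node = self._get(path)
        self._check(node, version)
        node[0] = data
        node[1] += 1          # every successful write bumps the version
        return node[1]

    def get_children(self, path):
        self._get(path)
        prefix = path.rstrip("/") + "/"
        names = (p[len(prefix):] for p in self.nodes
                 if p != "/" and p.startswith(prefix))
        return sorted(n for n in names if "/" not in n)

    def delete(self, path, version):
        node = self._get(path)
        self._check(node, version)
        if self.get_children(path):
            raise ZNodeError("node has children: " + path)
        del self.nodes[path]


zk = ToyZooKeeper()
zk.create("/app", b"")
zk.create("/app/config", b"v1")
data, version = zk.get_data("/app/config")
zk.set_data("/app/config", b"v2", version)   # succeeds, version becomes 1
```

A second `set_data` with the old `version` would raise `ZNodeError` in this sketch, which is how a client can detect that another client updated the znode first.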
Why multiple functions for a function


Atomicity
Message passing

Three notifications
exists -> znode insertion at a path
getData -> znode data updates
getChildren -> znode group broadcasts



Failure detection
Synchronization
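
The three notifications can be sketched as one-shot callbacks. This is a toy model, not the real client: `WatchedTree`, the event strings, and the callback signature are all assumptions made for illustration, and each call here triggers only the notification named on the slide (real watches cover more cases, such as deletion).

```python
class WatchedTree:
    """Toy znode store with one-shot watches (hypothetical, in-memory)."""

    def __init__(self):
        self.data = {"/": b""}       # path -> data
        self.exists_watches = {}     # path -> callbacks set via exists()
        self.data_watches = {}       # path -> callbacks set via get_data()
        self.child_watches = {}      # path -> callbacks set via get_children()

    def _fire(self, table, path, event):
        # pop() removes the callbacks, so each watch triggers at most once
        for callback in table.pop(path, []):
            callback(event, path)

    def exists(self, path, watch=None):
        if watch is not None:
            self.exists_watches.setdefault(path, []).append(watch)
        return path in self.data

    def get_data(self, path, watch=None):
        value = self.data[path]      # KeyError if the znode is missing
        if watch is not None:
            self.data_watches.setdefault(path, []).append(watch)
        return value

    def get_children(self, path, watch=None):
        if watch is not None:
            self.child_watches.setdefault(path, []).append(watch)
        prefix = path.rstrip("/") + "/"
        names = (p[len(prefix):] for p in self.data
                 if p != "/" and p.startswith(prefix))
        return sorted(n for n in names if "/" not in n)

    def create(self, path, data):
        self.data[path] = data
        parent = path.rsplit("/", 1)[0] or "/"
        self._fire(self.exists_watches, path, "created")
        self._fire(self.child_watches, parent, "children_changed")

    def set_data(self, path, data):
        self.data[path] = data
        self._fire(self.data_watches, path, "data_changed")


tree = WatchedTree()
seen = []
tree.exists("/ready", watch=lambda event, path: seen.append((event, path)))
tree.create("/ready", b"")          # fires the exists watch once
```

Because a watch is removed when it fires, a client that wants continuous notifications has to set the watch again on its next read.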
The many guarantees of ZooKeeper





Sequential consistency
Atomicity
Reliability
Group revision
Linearizable writes
ZooKeeper Implementation
Request Processor



Provides high availability through replication
Uses atomic broadcast to coordinate writes across servers
Read requests are answered directly from the local replica of the server the client is connected to
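
A rough sketch of the read and write paths described above, under heavy simplification: `Replica` and `ToyEnsemble` are hypothetical names, the shared `log` list stands in for the atomic broadcast, and a write is accepted only when a majority of replicas is reachable. It illustrates why a read served by a lagging replica can return an older value until that replica catches up.

```python
class Replica:
    """One server's in-memory copy of the data (toy model)."""

    def __init__(self):
        self.state = {}
        self.last_zxid = 0           # id of the last transaction applied

    def catch_up(self, log):
        # Apply, in order, every committed transaction not yet applied.
        for zxid, path, data in log[self.last_zxid:]:
            self.state[path] = data
            self.last_zxid = zxid

    def read(self, path):
        # Reads never leave the replica.
        return self.state.get(path)


class ToyEnsemble:
    """Stand-in for a replicated ensemble; the log models atomic broadcast."""

    def __init__(self, size=3):
        self.replicas = [Replica() for _ in range(size)]
        self.log = []                # committed (zxid, path, data) entries

    def write(self, path, data, lagging=()):
        reachable = [r for i, r in enumerate(self.replicas)
                     if i not in lagging]
        if len(reachable) <= len(self.replicas) // 2:
            raise RuntimeError("no quorum")
        self.log.append((len(self.log) + 1, path, data))
        for replica in reachable:
            replica.catch_up(self.log)

    def read(self, index, path):
        return self.replicas[index].read(path)

    def sync(self, index):
        # Bring one replica up to date before its next read.
        self.replicas[index].catch_up(self.log)


ensemble = ToyEnsemble(size=3)
ensemble.write("/cfg", "a")
ensemble.write("/cfg", "b", lagging={2})   # replica 2 misses this write
stale = ensemble.read(2, "/cfg")           # still the older value
ensemble.sync(2)
fresh = ensemble.read(2, "/cfg")           # now up to date
```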
ZooKeeper Implementation
Request Processor





Replicated database contains the entire tree
Maintains logs for recoverability
Clients connect to one server to submit requests
Transactions are idempotent
Writes are forwarded to a single server, the leader
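
One way to picture idempotent transactions: the leader turns a conditional request ("set the data if the version is 3") into an unconditional transaction that records the resulting state ("data is X, version is 4"). The sketch below assumes a dictionary as the database; `make_txn` and `apply_txn` are hypothetical helpers, not ZooKeeper functions.

```python
def make_txn(state, path, data, expected_version):
    """Leader side: turn a conditional request into a transaction.

    state maps path -> (data, version). The transaction stores the
    resulting data and version rather than the condition.
    """
    _, current_version = state[path]
    if expected_version != -1 and expected_version != current_version:
        return ("error", path)
    return ("set", path, data, current_version + 1)


def apply_txn(state, txn):
    """Replica side: applying the same transaction twice changes nothing more."""
    if txn[0] == "set":
        _, path, data, new_version = txn
        state[path] = (data, new_version)
    return state


db = {"/cfg": (b"old", 3)}
txn = make_txn(db, "/cfg", b"new", expected_version=3)
apply_txn(db, txn)
apply_txn(db, txn)    # replayed, for example during recovery; same result
```

Because the transaction carries the final data and version, a server that replays part of its log after a crash ends up in the same state as one that applied each transaction exactly once.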
ZooKeeper Primitives






Configuration Management
Rendezvous
Group membership
Simple locks
Read / Write locks
Double barrier
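
As an illustration of how such primitives can be built from znodes, here is a toy version of a simple lock that uses sequentially numbered, session-owned (ephemeral) znodes: the lowest-numbered znode holds the lock, and each waiter only needs to watch its immediate predecessor. `ToyLockService` and its methods are hypothetical and in-memory; the `lock-` name prefix is also an assumption.

```python
class ToyLockService:
    """In-memory model of a lock built from sequential, ephemeral znodes."""

    def __init__(self):
        self.counter = 0
        self.children = {}     # znode name -> owning session

    def create_sequential(self, session):
        # Zero-padding keeps lexicographic order equal to numeric order.
        name = "lock-%010d" % self.counter
        self.counter += 1
        self.children[name] = session
        return name

    def holder(self):
        # The lowest-numbered znode holds the lock.
        return min(self.children) if self.children else None

    def predecessor(self, name):
        # The znode a waiter would watch: the one just before its own.
        earlier = [child for child in self.children if child < name]
        return max(earlier) if earlier else None

    def release(self, name):
        self.children.pop(name, None)

    def session_expired(self, session):
        # Ephemeral znodes disappear when their session ends,
        # so a crashed client cannot hold the lock forever.
        for name in [n for n, s in self.children.items() if s == session]:
            del self.children[name]


locks = ToyLockService()
a = locks.create_sequential("client-A")
b = locks.create_sequential("client-B")
c = locks.create_sequential("client-C")
# client-A holds the lock; client-C waits on client-B's znode, not on A's
locks.session_expired("client-A")    # lock passes to client-B
```

Watching only the predecessor means that releasing the lock wakes a single waiter instead of every waiting client.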
Evaluation of ZooKeeper




Variable number of servers, fixed number of clients.
35 machines simulating 250 simultaneous clients, all using the asynchronous API.
Read/write payloads are all 1 KB in size.
Benchmarking is done on the client side.
Evaluation of ZooKeeper
1. Failure and recovery of a follower;
2. Failure and recovery of a different follower;
3. Failure of the leader;
4. Failure of two followers (a, b) in the first two marks, and recovery at the third mark (c);
5. Failure of the leader.
Conclusion





Wait-free approach to coordinating processes
Used in several applications
Yahoo Message Broker (Pub/Sub)
Hadoop
Katta – Distributed Indexer