ZooKeeper
Claudia Hauff
ZooKeeper
A highly available service for coordinating processes of distributed applications.
• Developed at Yahoo! Research
• Started as a sub-project of Hadoop, now a top-level Apache project
• Development is driven by application needs
http://zookeeper.apache.org/
ZooKeeper in the Hadoop ecosystem
[Layer diagram of the Hadoop stack:
  Pig (Data Flow), Hive (SQL), Sqoop (Data Transfer)
  MapReduce (Job Scheduling/Execution), HBase (Column DB)
  HDFS
  with ZooKeeper (Coordination) and Avro (Serialization) alongside the layers]
Coordination
Proper coordination is not easy.
Fallacies of distributed computing
• The network is reliable
• There is no latency
• The topology does not change
• The network is homogeneous
• The bandwidth is infinite
• …
Motivation
• In the past: a single program running on a single
computer with a single CPU
• Today: applications consist of independent programs
running on a changing set of computers
• Difficulty: coordination of those independent programs
• Developers have to deal with coordination logic and
application logic at the same time
ZooKeeper is designed to relieve developers from writing coordination logic code.
Let’s think …
Question: how do you elect the leader?
[Diagram: a program that crawls the Web, running on a cluster with a few hundred machines]
• Application logic: a program that crawls the Web
• Coordination logic: one machine (the leader) should coordinate the effort
Question: how do you lock a service?
[Diagram: a program that crawls the Web, running on a cluster with a few hundred machines, writing to one database]
• Application logic: a program that crawls the Web
• Coordination logic: the progress of the crawl is stored in a single database: who accesses what & when?
Question: how can the configuration be distributed?
[Diagram: a program that crawls the Web, running on a cluster with a few hundred machines, sharing one configuration file]
• Application logic: a program that crawls the Web
• Coordination logic: every worker should start with the same configuration
Solution approaches
• Be specific: develop a particular service for each
coordination task
• Locking service
• Leader election
• etc.
• Be general: provide an API to make many services possible
ZooKeeper: an API that enables application developers to implement their own primitives easily.
The rest: specific primitives are implemented on the server side.
What can a distributed system look like?
[Diagram: one master, several slaves]
+ simple
- coordination performed by the master
- single point of failure
- scalability
What can a distributed system look like?
[Diagram: alternative architecture]
+ no longer a single point of failure
- scalability is still an issue
What can a distributed system look like?
[Diagram: alternative architecture]
+ scalability
What makes distributed
system coordination difficult?
Partial failures make application writing difficult
[Diagram: a sender sends a message; nothing comes back (network failure)]
The sender does not know:
• whether the message was received
• whether the receiver’s process died before/after processing the message
Typical coordination problems
in distributed systems
• Static configuration: a list of operational parameters for the
system processes
• Dynamic configuration: parameter changes on the fly
• Group membership: who is alive?
• Leader election: who is in charge, who is a backup?
• Mutually exclusive access to critical resources (locks)
• Barriers (supersteps in Giraph for instance)
The ZooKeeper API allows us to implement all these coordination tasks easily.
ZooKeeper principles
ZooKeeper’s design principles
• API is wait-free
  • No blocking primitives in ZooKeeper (remember the dining philosophers, forks & deadlocks)
  • Blocking can be implemented by a client
  • No deadlocks
• Guarantees
  • Client requests are processed in FIFO order
  • Writes to ZooKeeper are linearisable
  • Clients receive notifications of changes before the changed data becomes visible
ZooKeeper’s strategy to be fast and reliable
• The ZooKeeper service is an ensemble of servers that use replication (high availability)
• Data is cached on the client side
  • Example: a client caches the ID of the current leader instead of probing ZooKeeper every time
• What if a new leader is elected?
  • Potential solution: polling (not optimal)
  • Watch mechanism: clients can watch for an update of a given data object
ZooKeeper is optimised for read-dominant operations!
ZooKeeper terminology
• Client: user of the ZooKeeper service
• Server: process providing the ZooKeeper service
• znode: in-memory data node in ZooKeeper,
organised in a hierarchical namespace (the data tree)
• Update/write: any operation which modifies the state
of the data tree
• Clients establish a session when connecting to
ZooKeeper
ZooKeeper’s data model: filesystem
• znodes are organised in a hierarchical namespace
• znodes can be manipulated by clients through the ZooKeeper API
• znodes are referred to by UNIX-style file system paths, e.g.
  /
  /app1   /app2
  /app1/p_1   /app1/p_2   /app1/p_3
All znodes store data (file-like) & can have children (directory-like).
znodes
• znodes are not designed for general data storage (they usually require storage in the order of kilobytes)
• znodes map to abstractions of the client application
Example, group membership protocol: client process p_i creates znode /app1/p_i under /app1; the znode persists as long as the process is running.
znode flags
• Clients manipulate znodes by creating and deleting them
• EPHEMERAL flag: clients create znodes which are deleted at the end of the client’s session
  (ephemeral, Greek: passing, short-lived)
• SEQUENTIAL flag: a monotonically increasing counter is appended to the znode’s path;
  the counter value of a new znode under a parent is always larger than the value of existing children
Example: create(/app1_5/p_, data, SEQUENTIAL) creates /app1_5/p_1, /app1_5/p_2, /app1_5/p_3, …
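A minimal sketch of how these flags map to the official Java client (org.apache.zookeeper); the connect string localhost:2181, the timeout and the assumption that /app1_5 already exists are illustrative, not part of the slide.

import org.apache.zookeeper.*;

public class ZnodeFlagsExample {
    public static void main(String[] args) throws Exception {
        // Connect; the watcher passed here only observes connection events.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 5000, event -> {});

        // EPHEMERAL: removed automatically when this client's session ends.
        zk.create("/app1_5/worker", new byte[0],
                  ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

        // SEQUENTIAL (PERSISTENT_SEQUENTIAL in the Java client): the server
        // appends a monotonically increasing counter to the path,
        // e.g. /app1_5/p_0000000001, /app1_5/p_0000000002, ...
        String path = zk.create("/app1_5/p_", new byte[0],
                  ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);
        System.out.println("created " + path);

        zk.close();
    }
}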
znodes & watch flag
• Clients can issue read operations on znodes with a
watch flag
• Server notifies the client when the information on the
znode has changed
• Watches are one-time triggers associated with a
session (unregistered once triggered or session closes)
• Watch notifications indicate the change, not the new
data
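A sketch of the one-time-trigger behaviour, assuming a local ensemble and the hypothetical znode /app1/config: because a watch fires only once, the watcher re-registers itself when it reads the data again.

import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;

public class WatchExample {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 5000, event -> {});

        Watcher watcher = new Watcher() {
            @Override
            public void process(WatchedEvent event) {
                // The notification says what changed (type + path), not the new data.
                System.out.println(event.getType() + " on " + event.getPath());
                try {
                    // One-time trigger: read again with this watcher to keep watching.
                    byte[] data = zk.getData("/app1/config", this, new Stat());
                    System.out.println("new data: " + new String(data));
                } catch (Exception e) { e.printStackTrace(); }
            }
        };

        // Read the znode and register the watch in one call.
        byte[] data = zk.getData("/app1/config", watcher, new Stat());
        System.out.println("initial data: " + new String(data));

        Thread.sleep(60000);   // keep the session alive to receive notifications
        zk.close();
    }
}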
Sessions
• A client connects to ZooKeeper and initiates a
session
• Sessions have an associated timeout
• ZooKeeper considers a client faulty if it does not
receive anything from its session for more than that
timeout
• Session ends: faulty client or explicitly ended by
client
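A sketch of session establishment with the Java client; the connect string and the 15-second timeout are assumed values.

import org.apache.zookeeper.*;
import java.util.concurrent.CountDownLatch;

public class SessionExample {
    public static void main(String[] args) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);

        // 15000 ms session timeout: if ZooKeeper hears nothing from this client
        // for longer than that, the session (and with it the client's ephemeral
        // znodes and watches) is discarded.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, event -> {
            if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();
            } else if (event.getState() == Watcher.Event.KeeperState.Expired) {
                System.out.println("session expired: client considered faulty");
            }
        });

        connected.await();                         // wait until the session is established
        System.out.println("session id: " + zk.getSessionId());

        zk.close();                                // explicitly end the session
    }
}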
A few implementation details
ZooKeeper data is replicated on each server that composes the service.
[Diagram: the replicated ZooKeeper service]
• the data tree is replicated across all servers and kept in memory
• updates are first logged to disk; a write-ahead log and snapshots are used for recovery
• a write request requires coordination between the servers
Source: http://bit.ly/13VFohW
A few implementation details
• ZooKeeper server services clients
• Clients connect to exactly one server to submit
requests
• read requests served from the local replica
• write requests are processed by an agreement
protocol (an elected server leader initiates
processing of the write request)
Let’s work through some examples.
ZooKeeper API
• String create(path, data, flags)
• creates a znode with path name path, stores data in it and sets flags
(ephemeral, sequential)
• void delete(path, version)
• deletes the znode if it is at the expected version
• Stat exists(path, watch)
• watch flag enables the client to set a watch on the znode
• (data, Stat) getData(path, watch)
• returns the data and meta-data of the znode
• Stat setData(path, data, version)
• writes data if the version number is the current version of the znode
• String[] getChildren(path, watch)
Note: no partial reads/writes (no open, seek or close methods), and no createLock() or similar methods.
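For orientation, a sketch of the same calls in the official Java client, where create() additionally takes an ACL argument; all paths and values below are illustrative.

import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;
import java.util.List;

public class ApiTour {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 5000, event -> {});

        // create(path, data, flags)
        String path = zk.create("/app1", "hello".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        // exists(path, watch)
        Stat stat = zk.exists("/app1", true);

        // getData(path, watch) -> (data, Stat)
        byte[] data = zk.getData("/app1", false, stat);

        // setData(path, data, version)
        zk.setData("/app1", "world".getBytes(), stat.getVersion());

        // getChildren(path, watch)
        List<String> children = zk.getChildren("/app1", false);

        // delete(path, version); version -1 means "any version"
        zk.delete("/app1", -1);

        zk.close();
    }
}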
Example: configuration
Questions:
1. How does a new worker query ZK for a configuration?
2. How does an administrator change the configuration on the fly?
3. How do the workers read the new configuration?
[Configuration stored in znode /app1/config; the tree also contains /app1/progress]
1. getData(/app1/config, true)
2. setData(/app1/config, config_data, -1)
   [watching clients are notified]
3. getData(/app1/config, true)
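A sketch of this configuration pattern in the Java client; the znode /app1/config and the read/write split follow the slide, while the connection details and the example payload are assumed.

import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;

public class ConfigExample {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 5000, event -> {});

        // 1. + 3. A worker reads the configuration and sets a watch; when the
        //         watch fires it simply reads (and re-watches) again.
        Watcher configWatcher = new Watcher() {
            @Override
            public void process(WatchedEvent event) {
                if (event.getType() == Event.EventType.NodeDataChanged) {
                    try {
                        byte[] newCfg = zk.getData("/app1/config", this, new Stat());
                        System.out.println("new config: " + new String(newCfg));
                    } catch (Exception e) { e.printStackTrace(); }
                }
            }
        };
        byte[] cfg = zk.getData("/app1/config", configWatcher, new Stat());
        System.out.println("initial config: " + new String(cfg));

        // 2. The administrator changes the configuration on the fly
        //    (version -1 = overwrite regardless of the current version);
        //    all watching workers are notified.
        zk.setData("/app1/config", "crawl.depth=5".getBytes(), -1);

        Thread.sleep(1000);    // give the notification time to arrive in this sketch
        zk.close();
    }
}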
Example: group membership
Questions:
1. How can all workers (slaves) of an application register themselves on ZK?
2. How can a process find out about all active workers of an application?
[A znode, /app1/workers, is designated to store the workers]
1. create(/app1/workers/worker, data, EPHEMERAL)
2. getChildren(/app1/workers, true)
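A sketch of the same recipe in the Java client; the worker name, its payload and the assumption that /app1/workers already exists are illustrative.

import org.apache.zookeeper.*;
import java.util.List;

public class GroupMembershipExample {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 5000, event -> {});

        // 1. Each worker registers itself with an EPHEMERAL znode: when the
        //    worker (or its session) dies, the znode disappears automatically.
        zk.create("/app1/workers/worker1", "10.0.0.7:8080".getBytes(),
                  ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

        // 2. Any process can list the active workers and set a watch
        //    (the watch fires when a child is added or removed).
        List<String> workers = zk.getChildren("/app1/workers", true);
        System.out.println("active workers: " + workers);

        Thread.sleep(60000);   // keep the session (and the ephemeral znode) alive
        zk.close();
    }
}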
Example: simple locks
Question:
1. How can all workers of an application use a single resource through a lock?
create(/app1/lock1, …, EPHEMERAL)
  ok? yes: use the locked resource
  ok? no: getData(/app1/lock1, true) and wait for the notification, then try again
All processes compete at all times for the lock.
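A sketch of this naive lock in the Java client (the class and method names are hypothetical); note how every waiting client watches the same znode, which is exactly the herd effect the next slide removes.

import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;
import java.util.concurrent.CountDownLatch;

public class SimpleLock {
    public static void acquire(ZooKeeper zk) throws Exception {
        while (true) {
            try {
                // Whoever manages to create the ephemeral znode holds the lock.
                zk.create("/app1/lock1", new byte[0],
                          ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
                return;                               // lock acquired
            } catch (KeeperException.NodeExistsException e) {
                // Lock is taken: watch the znode and wait for its deletion.
                CountDownLatch released = new CountDownLatch(1);
                Stat stat = zk.exists("/app1/lock1", event -> {
                    if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
                        released.countDown();
                    }
                });
                if (stat != null) {
                    released.await();                 // every waiter wakes up here: herd effect
                }
                // retry the create
            }
        }
    }
}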
Example: locking without herd effect
Question:
1. How can all workers of an application use a single resource through a lock?
1. id = create(/app1/locks/lock_, SEQUENTIAL | EPHEMERAL)
2. ids = getChildren(/app1/locks/, false)
3. if id = min(ids): exit (use the lock)
4. else: exists(znode with the largest id smaller than id, true), wait for the notification and go to step 2
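A sketch of the herd-free recipe in the Java client (class and helper names are hypothetical): each client creates a SEQUENTIAL|EPHEMERAL znode and watches only its immediate predecessor, so a release wakes at most one waiter.

import org.apache.zookeeper.*;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class HerdFreeLock {
    public static String acquire(ZooKeeper zk) throws Exception {
        // id = create(/app1/locks/lock_, SEQUENTIAL | EPHEMERAL)
        String myPath = zk.create("/app1/locks/lock_", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        String myNode = myPath.substring("/app1/locks/".length());

        while (true) {
            List<String> ids = zk.getChildren("/app1/locks", false);
            Collections.sort(ids);

            if (myNode.equals(ids.get(0))) {
                return myPath;                        // smallest id: we hold the lock
            }

            // Watch only the znode immediately before ours.
            String predecessor = ids.get(ids.indexOf(myNode) - 1);
            CountDownLatch gone = new CountDownLatch(1);
            if (zk.exists("/app1/locks/" + predecessor,
                          event -> gone.countDown()) != null) {
                gone.await();                         // wait until the predecessor is deleted
            }
            // predecessor is gone (or never existed): re-check the children
        }
    }
}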
Example: leader election
Question:
1. How can all workers of an application elect a leader among themselves?
getData(/app1/workers/leader, true)
  ok? yes: follow the leader
  ok? no: create(/app1/workers/leader, IP, EPHEMERAL)
    ok? yes: lead
    ok? no: follow the newly elected leader
If the leader dies, elect again (“herd effect”).
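A sketch of this election in the Java client (class and method names are hypothetical); it follows the slide’s flow: read the leader znode with a watch, and if it does not exist try to create it as an EPHEMERAL znode.

import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;

public class LeaderElectionExample {
    /** Returns true if this process became the leader. */
    public static boolean elect(ZooKeeper zk, String myIp) throws Exception {
        while (true) {
            try {
                // Is there a leader already? Watch the znode so the default
                // watcher is notified when the leader dies and elect() can be
                // called again.
                byte[] leader = zk.getData("/app1/workers/leader", true, new Stat());
                System.out.println("following " + new String(leader));
                return false;
            } catch (KeeperException.NoNodeException noLeader) {
                try {
                    // No leader: try to become it with an EPHEMERAL znode.
                    zk.create("/app1/workers/leader", myIp.getBytes(),
                              ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
                    return true;                      // we lead
                } catch (KeeperException.NodeExistsException lost) {
                    // Somebody else won the race ("herd effect"): loop and follow them.
                }
            }
        }
    }
}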
ZooKeeper
applications
The Yahoo! fetching service
• Fetching Service is part of Yahoo!’s crawler infrastructure
• Setup: master commands page-fetching processes
• Master provides the fetchers with configuration
• Fetchers write back information of their status and health
• Main advantage of ZooKeeper:
• Recovery from master failures
• Guaranteed availability despite failures
• Used primitives of ZK: configuration metadata, leader
election
Yahoo! message broker
• A distributed publish-subscribe system
• The system manages thousands of topics that clients
can publish messages to and receive messages from
• The topics are distributed among a set of servers to
provide scalability
• Used primitives of ZK: configuration metadata (to
distribute topics), failure detection and group
membership
Yahoo! message broker
[Diagram of the broker’s znode tree: a primary and a backup server per topic; topic subscribers; servers monitored via ephemeral nodes]
Source: http://bit.ly/13VFohW
Throughput
Setup: 250 clients, each client has at least 100 outstanding requests (read/write of 1K data)
[Plot: throughput for workloads ranging from write-only to read-only requests; the curves eventually cross]
Source: http://bit.ly/13VFohW
Recovery from failure
Setup: 250 clients, each client has at least 100 outstanding requests (read/write of 1K data); 5 ZK machines (1 leader, 4 followers), 30% writes
[Plot: throughput over time while failures are injected]
(1) failure & recovery of a follower
(2) failure & recovery of a different follower
(3) failure of the leader
(4) failure of followers (a, b), recovery at (c)
(5) failure of the leader
(6) recovery of the leader
Source: http://bit.ly/13VFohW
References
• [book] ZooKeeper by Junqueira & Reed, 2013
(available on the TUD campus network)
• [paper] ZooKeeper: Wait-free coordination for Internet-scale systems by Hunt et al., 2010; http://bit.ly/13VFohW
Summary
• Whirlwind tour through ZooKeeper
• Why do we need it?
• Data model of ZooKeeper: znodes
• Example implementations of different coordination
tasks